SYSTEM FOR THE IN VIVO DELIVERY AND
EXPRESSION OF HETEROLOGOUS GENES IN
THE BONE MARROW

FEDERALLY SPONSORED RESEARCH 
This invention was made with Government support under Grant 
Number 5 R01 AI22186 from the National Institutes of Health. The Government
has certain rights to this invention. 

FIELD OF THE INVENTION 
The present invention relates to recombinant DNA technology, and in 
particular to introducing and expressing foreign DNA in a eukaryotic cell. 

BACKGROUND OF THE INVENTION 
The Alphavirus genus includes a variety of viruses all of which are 
members of the Togaviridae family. The alphaviruses include Eastern Equine 
Encephalitis virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades
virus, Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), 
Sindbis virus, South African Arbovirus No. 86 (S.A.AR 86), Girdwood S.A. 
virus, Ockelbo virus, Semliki Forest virus, Middelburg virus, Chikungunya virus, 
O'Nyong-Nyong virus, Ross River virus, Barmah Forest virus, Getah virus, 
Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, Aura virus, Whataroa 
virus, Babanki virus, Kyzylagach virus, Highlands J virus, Fort Morgan virus, 
Ndumu virus, and Buggy Creek virus.

The alphavirus genome is a single-stranded, messenger-sense RNA,
modified at the 5'-end with a methylated cap, and at the 3'-end with a
variable-length poly(A) tract. The viral genome is divided into two regions: the first
encodes the nonstructural or replicase proteins (nsP1-nsP4) and the second encodes
the viral structural proteins. Strauss and Strauss, Microbiological Rev. 58, 491- 
562, 494 (1994). Structural subunits consisting of a single viral protein, C, 
associate with themselves and with the RNA genome in an icosahedral 
nucleocapsid. In the virion, the capsid is surrounded by a lipid envelope covered 
with a regular array of transmembranal protein spikes, each of which consists of 
a heterodimeric complex of two glycoproteins, E1 and E2. See Paredes et al.,
Proc. Natl. Acad. Sci. USA 90, 9095-99 (1993); Paredes et al., Virology 187, 324-32
(1993); Pedersen et al., J. Virol. 14:40 (1974).

Sindbis virus, the prototype member of the alphavirus genus of the 
family Togaviridae, and viruses related to Sindbis are broadly distributed 
throughout Africa, Europe, Asia, the Indian subcontinent, and Australia, based on 
serological surveys of humans, domestic animals and wild birds. Kokernot et al., 
Trans. R. Soc. Trop. Med. Hyg. 59, 553-62 (1965); Redaksie, S. Afr. Med. J. 42,
197 (1968); Adekolu-John and Fagbami, Trans. R. Soc. Trop. Med. Hyg. 77, 149-51
(1983); Darwish et al., Trans. R. Soc. Trop. Med. Hyg. 77, 442-45 (1983);
Lundstrom et al., Epidemiol. Infect. 106, 567-74 (1991); Morrill et al., J. Trop.
Med. Hyg. 94, 166-68 (1991). The first isolate of Sindbis virus (strain AR339) 
was recovered from a pool of Culex sp. mosquitoes collected in Sindbis, Egypt in 
1953 (Taylor et al., Am. J. Trop. Med. Hyg. 4, 844-62 (1955)), and is the most 
extensively studied representative of this group. Other members of the Sindbis 
group of alphaviruses include South African Arbovirus No. 86, Ockelbo82, and
Girdwood S.A. These viruses are not strains of the Sindbis virus; they are related 
to Sindbis AR339, but they are more closely related to each other based on 
nucleotide sequence and serological comparisons. Lundstrom et al., J. Wildl. Dis.
29, 189-95 (1993); Simpson et al., Virology 222, 464-69 (1996). Ockelbo82,
S.A.AR86 and Girdwood S.A. are all associated with human disease, whereas 
Sindbis is not. The clinical symptoms of human infection with Ockelbo82,
S.A.AR86, or Girdwood S.A. are a febrile illness, general malaise, maculopapular
rash, and joint pain that occasionally progresses to a polyarthralgia sometimes 
lasting from a few months to a few years. 

The study of these viruses has led to the development of beneficial 
techniques for vaccinating against the alphavirus diseases, and other diseases 
through the use of alphavirus vectors for the introduction of foreign DNA. See 
United States Patent No. 5,185,440 to Davis et al., and PCT Publication WO 
92/10578. It is intended that all United States patent references be incorporated 
in their entirety by reference. 

It is well known that live, attenuated viral vaccines are among the 
most successful means of controlling viral disease. However, for some virus 
pathogens, immunization with a live virus strain may be either impractical or 
unsafe. One alternative strategy is the insertion of sequences encoding immunizing 
antigens of such agents into a vaccine strain of another virus. One such system 
utilizing a live VEE vector is described in United States Patent No. 5,505,947 to 
Johnston et al. 

Sindbis virus vaccines have been employed as viral carriers in virus 
constructs which express genes encoding immunizing antigens for other viruses. 
See United States Patent No. 5,217,879 to Huang et al. Huang et al. describes 
Sindbis infectious viral vectors. However, the reference does not describe the 
cDNA sequence of Girdwood S.A. and TR339, nor clones or viral vectors 
produced therefrom. 

Another such system is described by Hahn et al., Proc. Natl. Acad.
Sci. USA 89:2679 (1992), wherein Sindbis virus constructs which express a 
truncated form of the influenza hemagglutinin protein are described. The 
constructs are used to study antigen processing and presentation in vitro and in 
mice. Although no infectious challenge dose is tested, it is also suggested that
such constructs might be used to produce protective B- and T-cell mediated
immunity. 

London et al., Proc. Natl. Acad. Sci. USA 89, 207-11 (1992),
disclose a method of producing an immune response in mice against a lethal Rift 
Valley Fever (RVF) virus by infecting the mice with an infectious Sindbis virus 
containing an RVF epitope. London does not disclose using Girdwood S.A. or
TR339 to induce an immune response in animals. 

Viral carriers can also be used to introduce and express foreign 
DNA in eukaryotic cells. One goal of such techniques is to employ vectors that 
target expression to particular cells and/or tissues. A current approach has been 
to remove target cells from the body, culture them ex vivo, infect them with an 
expression vector, and then reintroduce them into the patient. 

PCT Publication No. WO 92/10578 to Garoff and Liljestrom
provides a system for introducing and expressing foreign proteins in animal cells
using alphaviruses. This reference discloses the use of Semliki Forest virus to 
introduce and express foreign proteins in animal cells. The use of Girdwood S.A. 
or TR339 is not discussed. Furthermore, this reference does not provide a method 
of targeting and introducing foreign DNA into specific cell or tissue types. 

Accordingly, there remains a need in the art for full-length cDNA 
clones of positive-strand RNA viruses, such as Girdwood S.A. and TR339. In
addition, there is an ongoing need in the art for improved vaccination strategies.
Finally, there remains a need in the art for improved methods and nucleic acid 
sequences for delivering foreign DNA to target cells. 



SUMMARY OF THE INVENTION
A first aspect of the present invention is a method of introducing
and expressing heterologous RNA in bone marrow cells, comprising: (a) providing
a recombinant alphavirus, the alphavirus containing a heterologous RNA segment,
the heterologous RNA segment comprising a promoter operable in bone marrow
cells operatively associated with a heterologous RNA to be expressed in bone
marrow cells; and then (b) contacting the recombinant alphavirus to the bone
marrow cells so that the heterologous RNA segment is introduced and expressed
therein.



As a second aspect, the present invention provides a helper cell for 
expressing an infectious, propagation defective, Girdwood S.A. virus particle, 
comprising, in a Girdwood S.A.-permissive cell: (a) a first helper RNA encoding 
(i) at least one Girdwood S.A. structural protein, and (ii) not encoding at least one 
other Girdwood S.A. structural protein; and (b) a second helper RNA separate 
from the first helper RNA, the second helper RNA (i) not encoding the at least one 
Girdwood S.A. structural protein encoded by the first helper RNA, and (ii)
encoding the at least one other Girdwood S.A. structural protein not encoded by
the first helper RNA, and with all of the Girdwood S.A. structural proteins
encoded by the first and second helper RNAs assembling together into Girdwood
S.A. particles in the cell containing the replicon RNA; and wherein the Girdwood
S.A. packaging segment is deleted from at least the first helper RNA. 

A third aspect of the present invention is a method of making
infectious, propagation defective, Girdwood S.A. virus particles, comprising:
transfecting a Girdwood S.A.-permissive cell with a propagation defective replicon
RNA, the replicon RNA including the Girdwood S.A. packaging segment and an
inserted heterologous RNA; producing the Girdwood S.A. virus particles in the
transfected cell; and then collecting the Girdwood S.A. virus particles from the
cell. Also disclosed are infectious Girdwood S.A. RNAs, cDNAs encoding the
same, infectious Girdwood S.A. virus particles, and pharmaceutical formulations
thereof.

As a fourth aspect, the present invention provides a helper cell for 
expressing an infectious, propagation defective, TR339 virus particle, comprising,
in a TR339-permissive cell: (a) a first helper RNA encoding (i) at least one TR339
structural protein, and (ii) not encoding at least one other TR339 structural protein; 
and (b) a second helper RNA separate from the first helper RNA, the second 
helper RNA (i) not encoding the at least one TR339 structural protein encoded by 
the first helper RNA, and (ii) encoding the at least one other TR339 structural 
protein not encoded by the first helper RNA, and with all of the TR339 structural
proteins encoded by the first and second helper RNAs assembling together into 
TR339 particles in the cell containing the replicon RNA; and wherein the TR339 
packaging segment is deleted from at least the first helper RNA. 

A fifth aspect of the present invention is a method of making 
infectious, propagation defective, TR339 virus particles, comprising: transfecting
a TR339-permissive cell with a propagation defective replicon RNA, the replicon 
RNA including the TR339 packaging segment and an inserted heterologous RNA; 
producing the TR339 virus particles in the transfected cell; and then collecting the 
TR339 virus particles from the cell. Also disclosed are infectious TR339 RNAs, 
cDNAs encoding the same, infectious TR339 virus particles, and pharmaceutical 
formulations thereof. 

As a sixth aspect, the present invention provides a recombinant 
DNA comprising a cDNA coding for an infectious Girdwood S.A. virus RNA
transcript, and a heterologous promoter positioned upstream from the cDNA and 
operatively associated therewith. The present invention also provides infectious 
RNA transcripts encoded by the above-mentioned cDNA and infectious viral 
particles containing the infectious RNA transcripts. 

As a seventh aspect, the present invention provides a recombinant 
DNA comprising a cDNA coding for a Sindbis strain TR339 RNA transcript, and 
a heterologous promoter positioned upstream from the cDNA and operatively
associated therewith. The present invention also provides infectious RNA 
transcripts encoded by the above-mentioned cDNA and infectious viral particles 
containing the infectious RNA transcripts.

The foregoing and other aspects of the present invention are
described in the detailed description set forth below. 

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 presents the cDNA sequence (SEQ ID NO:1) of
S.A.AR86. The RNA sequence of the 5' 40 nucleotides was obtained by direct
sequencing of the genomic RNA. The rest of the genome was sequenced by RT-PCR
of fragments amplified from virion RNA. Nucleotides 1 through 59
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60
through 7559 (nsP1--nt60 through nt1679; nsP2--nt1680 through nt4099;
nsP3--nt4100 through nt5729; nsP4--nt5730 through nt7559), the structural polyprotein
is encoded by nucleotides 7608 through 11342 (capsid--nt7608 through nt8399;
E3--nt8400 through nt8591; E2--nt8592 through nt9860; 6K--nt9861 through nt10025;
E1--nt10026 through nt11342), and the 3' UTR is represented by nucleotides
11346 through 11663.
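
The coordinates recited for Figure 1 can be checked for internal consistency with a short script. The following Python sketch is illustrative only: the dictionary layout and the helper names (`span_length`, `is_contiguous`) are assumptions introduced here, and the only data used are the nucleotide positions stated above for S.A.AR86.

```python
# Feature coordinates (1-based, inclusive) as recited for Figure 1 (S.A.AR86).
# The data structure and helper names are hypothetical, for illustration only.
SAAR86_FEATURES = {
    "5' UTR": (1, 59),
    "nsP1": (60, 1679),
    "nsP2": (1680, 4099),
    "nsP3": (4100, 5729),
    "nsP4": (5730, 7559),
    "capsid": (7608, 8399),
    "E3": (8400, 8591),
    "E2": (8592, 9860),
    "6K": (9861, 10025),
    "E1": (10026, 11342),
    "3' UTR": (11346, 11663),
}

NONSTRUCTURAL = ["nsP1", "nsP2", "nsP3", "nsP4"]
STRUCTURAL = ["capsid", "E3", "E2", "6K", "E1"]


def span_length(span):
    """Length in nucleotides of a 1-based, inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def is_contiguous(names, features):
    """True if each named feature begins one nucleotide after the previous one ends."""
    spans = [features[name] for name in names]
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(spans, spans[1:]))


nonstructural_nt = sum(span_length(SAAR86_FEATURES[n]) for n in NONSTRUCTURAL)
structural_nt = sum(span_length(SAAR86_FEATURES[n]) for n in STRUCTURAL)
```

Under these coordinates the four nonstructural proteins tile nucleotides 60 through 7559 without gaps, and the five structural proteins tile nucleotides 7608 through 11342; both totals are multiples of three, as expected for open reading frames.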

Figure 1A shows nucleotides 1 through 3800 of the cDNA sequence
of S.A.AR86.

Figure 1B shows nucleotides 3801 through 7900 of the cDNA
sequence of S.A.AR86. 

Figure 1C shows nucleotides 7901 through 11663 of the cDNA 
sequence of S.A.AR86. 

Figure 2 presents the putative amino acid sequences of the 
S.A.AR86 polyproteins (SEQ ID NO:2 and SEQ ID NO:3). The amino acids 
were derived from the S.A.AR86 cDNA sequence given in Figure 1 (SEQ ID 
NO:1).

Figure 2A shows the amino acid sequence of the non-structural
polyprotein of S.A.AR86 (SEQ ID NO:2). 

Figure 2B shows the amino acid sequence of the structural 
polyprotein of S.A.AR86 (SEQ ID NO:3). 

Figure 3 presents the cDNA sequence (SEQ ID NO:4) of Girdwood
S.A. The RNA sequence of the 5' 40 nucleotides was obtained by direct
sequencing of the genomic RNA. The rest of the genome sequence was obtained
by sequencing of fragments amplified by RT-PCR from virion RNA. An "N" in
the sequence indicates that the identity of the nucleotide at that position is
unknown. Nucleotides 1 through 59 represent the 5' UTR, the non-structural
polyprotein is encoded by nucleotides 60 through 7613 (nsP1--nt60 through
nt1679; nsP2--nt1680 through nt4099; nsP3--nt4100 through nt5762 or nt5783;
nsP4--nt5784 through nt7613), the structural polyprotein is encoded by nucleotides
7662 through 11396 (capsid--nt7662 through nt8453; E3--nt8454 through nt8645;
E2--nt8646 through nt9914; 6K--nt9915 through nt10079; E1--nt10080 through
nt11396), and the 3' UTR is represented by nucleotides 11400 through 11717.
There is an opal termination codon at nucleotides 5763 through 5765.

Figure 3A shows nucleotides 1 through 3800 of the cDNA sequence
of Girdwood S.A.

Figure 3B shows nucleotides 3801 through 7900 of the cDNA
sequence of Girdwood S.A.

Figure 3C shows nucleotides 7901 through 11717 of the cDNA
sequence of Girdwood S.A.



Figure 4 illustrates the putative amino acid sequences of the
Girdwood S.A. polyproteins (SEQ ID NO:5 and SEQ ID NO:6). The amino
acids were derived from the Girdwood S.A. cDNA sequence given in Figure 3
(SEQ ID NO:4).

Figure 4A shows the amino acid sequence of the non-structural 
polyprotein of Girdwood S.A. The sequence terminates at the opal termination 
codon. The complete amino acid sequence is presented in SEQ ID NO:5. 

Figure 4B shows the amino acid sequence of the structural 
polyprotein of Girdwood S.A. (SEQ ID NO:6).

Figure 5 illustrates the nucleotide sequence (SEQ ID NO:7) of 
clone pS55, a cDNA clone of the S.A.AR86 genomic RNA. 

Figure 5A shows nucleotides 1 through 6720 of the cDNA sequence
of pS55.

Figure 5B shows nucleotides 6721 through 11663 of the cDNA
sequence of pS55.

Figure 6 presents the cDNA sequence (SEQ ID NO:8) of clone
pTR339. The TR339 virus is derived from this clone. Nucleotides 1 through 59
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60
through 7598 (nsP1--nt60 through nt1679; nsP2--nt1680 through nt4099;
nsP3--nt4100 through nt5747 or nt5768; nsP4--nt5769 through nt7598), the structural
polyprotein is encoded by nucleotides 7647 through 11381 (capsid--nt7647 through
nt8438; E3--nt8439 through nt8630; E2--nt8631 through nt9899; 6K--nt9900
through nt10064; E1--nt10065 through nt11381), and the 3' UTR is represented
by nucleotides 11382 through 11703. There is an opal termination codon at
nucleotides 5748 through 5750.

Figure 6A shows nucleotides 1 through 6720 of the cDNA sequence
of pTR339.

Figure 6B shows nucleotides 6721 through 11703 of the cDNA
sequence of pTR339.

DETAILED DESCRIPTION OF THE INVENTION
The production and use of recombinant DNA, vectors, transformed
host cells, selectable markers, proteins, and protein fragments by genetic
engineering are well-known to those skilled in the art. See, e.g., United States
Patent No. 4,761,371 to Bell et al. at Col. 6 line 3 to Col. 9 line 65; United States
Patent No. 4,877,729 to Clark et al. at Col. 4 line 38 to Col. 7 line 6; United
States Patent No. 4,912,038 to Schilling at Col. 3 line 26 to Col. 14 line 12; and
United States Patent No. 4,879,224 to Wallner at Col. 6 line 8 to Col. 8 line 59.

The term "alphavirus" has its conventional meaning in the art, and
includes the various species of alphaviruses such as Eastern Equine Encephalitis
virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades virus,
Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), Sindbis virus,
South African Arbovirus No. 86, Girdwood S.A. virus, Ockelbo virus, Semliki
Forest virus, Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross
River virus, Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus,
Mayaro virus, Una virus, Aura virus, Whataroa virus, Babanki virus, Kyzylagach
virus, Highlands J virus, Fort Morgan virus, Ndumu virus, Buggy Creek virus,
and any other virus classified by the International Committee on Taxonomy of
Viruses (ICTV) as an alphavirus. The preferred alphaviruses for use in the present
invention include Sindbis virus strains (e.g., TR339), Girdwood S.A., S.A.AR86,
and Ockelbo82.

An "Old World alphavirus" is a virus that is primarily distributed
throughout the Old World. Alternately stated, an Old World alphavirus is a virus
that is primarily distributed throughout Africa, Asia, Australia and New Zealand,
or Europe. Exemplary Old World viruses include SF group alphaviruses and SIN
group alphaviruses. SF group alphaviruses include Semliki Forest virus,
Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross River virus,
Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus,
and Una virus. SIN group alphaviruses include Sindbis virus, South African
Arbovirus No. 86, Ockelbo virus, Girdwood S.A. virus, Aura virus, Whataroa
virus, Babanki virus, and Kyzylagach virus.

Acceptable alphaviruses include those containing attenuating
mutations. The phrases "attenuating mutation" and "attenuating amino acid," as
used herein, mean a nucleotide sequence containing a mutation, or an amino acid
encoded by a nucleotide sequence containing a mutation, which mutation results
in a decreased probability of causing disease in its host (i.e., a loss of virulence),
in accordance with standard terminology in the art, whether the mutation be a
substitution mutation or an in-frame deletion mutation. See, e.g., B. DAVIS ET
AL., MICROBIOLOGY 132 (3d ed. 1980). The phrase "attenuating mutation"
excludes mutations or combinations of mutations which would be lethal to the
virus.



Appropriate attenuating mutations will be dependent upon the 
alphavirus used. Suitable attenuating mutations within the alphavirus genome will 
be known to those skilled in the art. Exemplary attenuating mutations include, but 
are not limited to, those described in United States Patent No. 5,505,947 to 
Johnston et al., copending United States application 08/448,630 to Johnston et al.,
and copending United States application 08/446,932 to Johnston et al. It is 
intended that all United States patent references be incorporated in their entirety 
by reference. 

Attenuating mutations may be introduced into the RNA by 
performing site-directed mutagenesis on the cDNA which encodes the RNA, in 
accordance with known procedures. See, Kunkel, Proc. Natl. Acad. Sci. USA 82, 
488 (1985), the disclosure of which is incorporated herein by reference in its 
entirety. Alternatively, mutations may be introduced into the RNA by replacement 
of homologous restriction fragments in the cDNA which encodes for the RNA, in 
accordance with known procedures.

I. Methods for Introducing and Expressing Heterologous RNA in Bone
Marrow Cells.

The present invention provides methods of using a recombinant 
alphavirus to introduce and express a heterologous RNA in bone marrow cells. 
Such methods are useful as vaccination strategies when the heterologous RNA 
encodes an immunogenic protein or peptide. Alternatively, such methods are 
useful in introducing and expressing in bone marrow cells an RNA which encodes 
a desirable protein or peptide, for example, a therapeutic protein or peptide. 

The present invention is carried out using a recombinant alphavirus 
to introduce a heterologous RNA into bone marrow cells. Any alphavirus that 
targets and infects bone marrow cells is suitable. Preferred alphaviruses include 
Old World alphaviruses, more preferably SF group alphaviruses and SIN group 
alphaviruses, more preferably Sindbis virus strains (e.g., TR339), S.A.AR86 
virus, Girdwood S.A. virus, and Ockelbo virus. In a more preferred embodiment, 
the alphavirus contains one or more attenuating mutations, as described 
hereinabove. 

Two types of recombinant virus vector are contemplated in carrying 
out the present invention. In one embodiment employing "double promoter 
vectors," the heterologous RNA is inserted into a replication and propagation 
competent virus. Double promoter vectors are described in United States Patent 
No. 5,505,947 to Johnston et al. With this type of viral vector, it is preferable 
that heterologous RNA sequences of less than 3 kilobases are inserted into the viral 
vector, more preferably those less than 2 kilobases, and more preferably still those 
less than 1 kilobase. In an alternate embodiment, propagation-defective "replicon 
vectors," as described in copending United States application 08/448,630 to 
Johnston et al., will be used. One advantage of replicon viral vectors is that larger
RNA inserts, up to approximately 4-5 kilobases in length, can be utilized. Double
promoter vectors and replicon vectors are described in more detail hereinbelow. 
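
The insert-size guidance for the two vector types can be summarized as a simple lookup. The Python sketch below is illustrative only: the function name `candidate_vectors` is hypothetical, and treating the upper end of the "approximately 4-5 kilobases" range as a hard 5,000-nucleotide cutoff is an assumption made purely so the example is concrete.

```python
# Illustrative size limits taken from the text above. The hard 5,000-nt cutoff
# for replicon vectors is an assumption; the text says "approximately 4-5 kilobases".
DOUBLE_PROMOTER_MAX_NT = 3000   # inserts of less than 3 kilobases are preferred
REPLICON_MAX_NT = 5000          # assumed upper bound of the approximate range


def candidate_vectors(insert_nt):
    """Return the vector types whose stated size guidance the insert satisfies."""
    candidates = []
    if insert_nt < DOUBLE_PROMOTER_MAX_NT:
        candidates.append("double promoter vector")
    if insert_nt <= REPLICON_MAX_NT:
        candidates.append("replicon vector")
    return candidates
```

For example, `candidate_vectors(4200)` returns only the replicon vector, reflecting the stated advantage of replicon vectors for larger inserts.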

The recombinant alphaviruses of the claimed method target the
heterologous RNA to bone marrow cells, where it expresses the encoded protein or 
peptide. Heterologous RNA can be introduced and expressed in any cell type found in the 
bone marrow. Bone marrow cells that may be targeted by the recombinant alphaviruses of 
the present invention include, but are not limited to, polymorphonuclear cells, hemopoietic
stem cells (including megakaryocyte colony forming units (CFU-M), spleen colony
forming units (CFU-S), erythroid colony forming units (CFU-E), erythroid burst forming
units (BFU-E), and colony forming units in culture (CFU-C)), erythrocytes, macrophages
(including reticular cells), monocytes, granulocytes, megakaryocytes, lymphocytes,
fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells,
chondrocytes and other cells of synovial joints. Preferably, marrow cells within the
endosteum are targeted, more preferably osteoblasts. Also preferred are methods in which
cells in the endosteum of synovial joints (e.g., hip and knee joints) are targeted.

By targeting to the cells of the bone marrow, it is meant that the primary
site in which the virus will be localized in vivo is the cells of the bone marrow.
Alternately stated, the alphaviruses of the present invention target bone marrow cells, such
that titers in bone marrow two days after infection are greater than 100 PFU/g crushed
bone, preferably greater than 200 PFU/g crushed bone, more preferably greater than 300
PFU/g crushed bone, and more preferably still greater than 500 PFU/g crushed bone.
Virus may be detected occasionally in other cell or tissue types, but only sporadically and
usually at low levels. Virus localization in the bone marrow can be demonstrated by any
suitable technique known in the art, such as in situ hybridization.
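
The titer thresholds above form a graded scale, which may be easier to follow as a small classifier. The Python sketch below is illustrative only: the function name `classify_titer` and the tier labels are hypothetical, and the thresholds are the ones stated above for bone marrow titers two days after infection.

```python
# Thresholds in PFU per gram of crushed bone, two days after infection,
# ordered from most to least stringent. Labels are illustrative assumptions.
TITER_TIERS = [
    (500, "more preferred still"),
    (300, "more preferred"),
    (200, "preferred"),
    (100, "targets bone marrow"),
]


def classify_titer(pfu_per_g):
    """Return the most stringent tier whose threshold the titer strictly exceeds."""
    for threshold, label in TITER_TIERS:
        if pfu_per_g > threshold:
            return label
    return "below targeting threshold"
```

Each comparison is strict because the text specifies titers "greater than" each value, so a titer of exactly 100 PFU/g does not meet the lowest threshold.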

Bone marrow cells are long-lived and harbor infectious alphaviruses for a 
prolonged period of time, as demonstrated in the Examples below. These characteristics 
of bone marrow cells render the present invention useful not only for the purpose of
supplying a desired protein or peptide to skeletal tissue, but also for expressing proteins or 
peptides in vivo that are needed by other cell or tissue types. 

The present invention can be carried out in vivo or with cultured bone 
marrow cells in vitro. Bone marrow cell cultures include primary cultures
of bone marrow cells, serially-passaged cultures of bone marrow cells, and
cultures of immortalized bone marrow cell lines. Bone marrow cells may be
cultured by any suitable means known in the art. 

The recombinant alphaviruses of the present invention carry a
heterologous RNA segment. The heterologous RNA segment encodes a promoter
and an inserted heterologous RNA. The inserted heterologous RNA may encode
any protein or peptide which is desirably expressed by the host bone marrow
cells. Suitable heterologous RNA may be of prokaryotic (e.g., RNA encoding the
Botulinus toxin C), or eukaryotic (e.g., RNA encoding malaria Plasmodium
protein cs1) origin. Illustrative proteins and peptides encoded by the heterologous
RNAs of the present invention include hormones, growth factors, interleukins,
cytokines, chemokines, enzymes, and ribozymes. Alternately, the heterologous
RNAs encode any therapeutic protein or peptide. As a further alternative, the
heterologous RNAs of the present invention encode any immunogenic protein or
peptide.

An immunogenic protein or peptide, or "immunogen," may be any
protein or peptide suitable for protecting the subject against a disease, including
but not limited to microbial, bacterial, protozoal, parasitic, and viral diseases. For
example, the immunogen may be an orthomyxovirus immunogen (e.g., an
influenza virus immunogen, such as the influenza virus hemagglutinin (HA)
surface protein or the influenza virus nucleoprotein gene, or an equine influenza
virus immunogen), or a lentivirus immunogen (e.g., an equine infectious anemia
virus immunogen, a Simian Immunodeficiency Virus (SIV) immunogen, or a
Human Immunodeficiency Virus (HIV) immunogen, such as the HIV envelope
GP160 protein and the HIV matrix/capsid proteins). The immunogen may also be
an arenavirus immunogen (e.g., Lassa fever virus immunogen, such as the Lassa
fever virus nucleocapsid protein gene and the Lassa fever envelope glycoprotein
gene), a poxvirus immunogen (e.g., vaccinia), a flavivirus immunogen (e.g., a
yellow fever virus immunogen or a Japanese encephalitis virus immunogen), a
filovirus immunogen (e.g., an Ebola virus immunogen, or a Marburg virus
immunogen), a bunyavirus immunogen (e.g., RVFV, CCHF, and SFS viruses),
or a coronavirus immunogen (e.g., an infectious human coronavirus immunogen,
such as the human coronavirus envelope glycoprotein gene, or a transmissible
gastroenteritis virus immunogen for pigs, or an infectious bronchitis virus
immunogen for chickens).

Alternatively, the present invention can be used to express 
heterologous RNAs encoding antisense oligonucleotides. In general, "antisense" 
refers to the use of small, synthetic oligonucleotides to inhibit gene expression by 
inhibiting the function of the target mRNA containing the complementary 
sequence. Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). Gene
expression is inhibited through hybridization to coding (sense) sequences in a 
specific mRNA target by hydrogen bonding according to Watson-Crick base 
pairing rules. The mechanism of antisense inhibition is that the exogenously 
applied oligonucleotides decrease the mRNA and protein levels of the target gene. 
Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). See also Helene,
C. and Toulme, J., Biochim. Biophys. Acta 1049, 99-125 (1990); Cohen, J.S., 
Ed., OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE 
EXPRESSION, CRC Press, Boca Raton, FL (1987).

Antisense oligonucleotides may be of any suitable length, depending 
on the particular target being bound. The only limit on the length of the antisense
oligonucleotide is the capacity of the virus for inserted heterologous RNA.
Antisense oligonucleotides may be complementary to the entire mRNA transcript 
of the target gene or only a portion thereof. Preferably the antisense 
oligonucleotide is directed to an mRNA region containing a junction between 
intron and exon. Where the antisense oligonucleotide is directed to an intron/exon 
junction, it may either entirely overlie the junction or may be sufficiently close to 
the junction to inhibit splicing out of the intervening exon during processing of 
precursor mRNA to mature mRNA (e.g., with the 3' or 5' terminus of the
antisense oligonucleotide being positioned within about, for example, 10, 5, 3 or
2 nucleotides of the intron/exon junction). Also preferred are antisense
oligonucleotides which overlap the initiation codon. 
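
The positional criteria above can be illustrated computationally. The Python sketch below is illustrative only: the function names `antisense_rna` and `near_junction`, the 1-based coordinate convention, and the definition of the junction as the last nucleotide before the intron/exon boundary are all assumptions introduced for this example, not part of the disclosed method.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target):
    """Return the reverse complement (written 5' to 3') of an RNA target sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(target.upper()))


def near_junction(oligo_start, oligo_end, junction, window=10):
    """Check whether an oligonucleotide's target site overlies or lies close to a junction.

    Positions are 1-based coordinates on the precursor mRNA (an assumed convention).
    `junction` is the position of the last nucleotide before the boundary, so the
    site overlies the junction when it covers both `junction` and `junction + 1`.
    Otherwise the nearer terminus must lie within `window` nucleotides.
    """
    if oligo_start <= junction < oligo_end:
        return True
    nearest = min(abs(oligo_start - junction), abs(oligo_end - junction))
    return nearest <= window
```

The `window` parameter corresponds to the example distances given above (10, 5, 3 or 2 nucleotides); a smaller window is a stricter criterion.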

When practicing the present invention, the antisense oligonucleotides
administered may be related in origin to the species to which they are administered.
When treating humans, human antisense may be used if desired.

Promoters for use in carrying out the present invention are operable
in bone marrow cells. An operable promoter in bone marrow cells is a promoter
that is recognized by and functions in bone marrow cells. Promoters for use with
the present invention must also be operatively associated with the heterologous
RNA to be expressed in the bone marrow. A promoter is operably linked to a
heterologous RNA if it controls the transcription of the heterologous RNA, where
the heterologous RNA comprises a coding sequence. Suitable promoters are well
known in the art. The Sindbis 26S promoter is preferred when the alphavirus is
a strain of Sindbis virus. Additional preferred promoters beyond the Sindbis 26S
promoter include the Girdwood S.A. 26S promoter when the alphavirus is
Girdwood S.A., the S.A.AR86 26S promoter when the alphavirus is S.A.AR86,
and any other promoter sequence recognized by alphavirus polymerases.
Alphavirus promoter sequences containing mutations which alter the activity level
of the promoter (in relation to the activity level of the wild-type) are also suitable
in the practice of the present invention. Such mutant promoter sequences are
described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the disclosure of
which is incorporated in its entirety by reference.

The heterologous RNA is introduced into the bone marrow cells by
contacting the recombinant alphavirus carrying the heterologous RNA segment to
the bone marrow cells. By contacting, it is meant bringing the recombinant
alphavirus and the bone marrow cells in physical proximity. The contacting step
can be performed in vitro or in vivo. In vitro contacting can be carried out with
cultures of immortalized or non-immortalized bone marrow cells. In one particular
embodiment, bone marrow cells can be removed from a subject, cultured in vitro,
infected with the vector, and then introduced back into the subject. Contacting is
performed in vivo when the recombinant alphavirus is administered to a subject.
Pharmaceutical formulations of recombinant alphavirus can be administered to a
subject by parenteral (e.g., subcutaneous, intracerebral, intradermal, intramuscular,
intravenous and intraarticular) administration. Alternatively, pharmaceutical
formulations of the present invention may be suitable for administration to the
mucous membranes of a subject (e.g., intranasal administration, by use of a
dropper, swab, or inhaler). Methods of preparing infectious virus particles and
pharmaceutical formulations thereof are discussed in more detail hereinbelow.

By "introducing" the heterologous RNA segment into the bone 
marrow cells it is meant infecting the bone marrow cells with recombinant 
alphavirus containing the heterologous RNA, such that the viral vector carrying the 
heterologous RNA enters the bone marrow cells and can be expressed therein. As 
used with respect to the present invention, when the heterologous RNA is 
"expressed," it is meant that the heterologous RNA is transcribed. In particular 
embodiments of the invention in which it is desired to produce a protein or 
peptide, expression further includes the steps of post-transcriptional processing and 
translation of the mRNA transcribed from the heterologous RNA. In contrast, 
where the heterologous RNA encodes an antisense oligonucleotide, expression need 
not include post-transcriptional processing and translation. With respect to 
embodiments in which the heterologous RNA encodes an immunogenic protein or 
a protein being administered for therapeutic purposes, expression may also include 
the further step of post-translational processing to produce an immunogenic or 
therapeutically-active protein. 

The present invention also provides infectious RNAs, as described 
hereinabove, and cDNAs encoding the same. Preferably the infectious RNAs and 
cDNAs are derived from the S.A.AR86, Girdwood S.A., TR339, or Ockelbo 
viruses. The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which 
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983). 

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA. 



A. Double Promoter Vectors. 

In one embodiment of the invention, double promoter vectors are 
used to introduce the heterologous RNA into the target bone marrow cells. A 
double promoter virus vector is a replication and propagation competent virus. 
Double promoter vectors are described in United States Patent No. 5,505,947 to 
Johnston et al. , the disclosure of which is incorporated in its entirety by reference. 
Preferred alphaviruses for constructing the double promoter vectors are S.A.AR86, 
Girdwood S.A., TR339 and Ockelbo viruses. More preferably, the double 
promoter vector contains one or more attenuating mutations. Attenuating 
mutations are described in more detail hereinabove. 



The double promoter vector is constructed so as to contain a second 
subgenomic promoter (i.e., 26S promoter) inserted 3' to the virus RNA encoding 
the structural proteins. The heterologous RNA is inserted between the second 
subgenomic promoter, so as to be operatively associated therewith, and the 3' 
UTR of the virus genome. Heterologous RNA sequences of less than 3 kilobases, 
more preferably those less than 2 kilobases, and more preferably still those less 
than 1 kilobase, can be inserted into the double promoter vector. In a preferred 
embodiment of the invention, the double promoter vector is derived from 
Girdwood S.A., and the second subgenomic promoter is a duplicate of the 
Girdwood S.A. subgenomic promoter. In an alternate preferred embodiment, the 
double promoter vector is derived from TR339, and the second subgenomic 
promoter is a duplicate of the TR339 subgenomic promoter. 



B. Replicon Vectors. 

Replicon vectors, which are propagation-defective virus vectors, can 
also be used to carry out the present invention. Replicon vectors are described in 
more detail in copending United States Application 08/448,630 to Johnston et al., 
the disclosure of which is incorporated in its entirety by reference. Preferred 
alphaviruses for constructing the replicon vectors are S.A.AR86, Girdwood S.A., 
TR339, and Ockelbo. 

In general, in the replicon system, a foreign gene to be expressed 
is inserted in place of at least one of the viral structural protein genes in a 
transcription plasmid containing an otherwise full-length cDNA copy of the 
alphavirus genome RNA. RNA transcribed from this plasmid contains an intact 
copy of the viral nonstructural genes which are responsible for RNA replication 
and transcription. Thus, if the transcribed RNA is transfected into susceptible 
cells, it will be replicated and translated to give the nonstructural proteins. These 
proteins will transcribe the transfected RNA to give high levels of subgenomic 
mRNA, which will then be translated to produce high levels of the foreign protein. 
The autonomously replicating RNA (i.e. , replicon) can only be packaged into virus 
particles if the alphavirus structural protein genes are provided on one or more 
"helper" RNAs, which are cotransfected into cells along with the replicon RNA. 
The helper RNAs do not contain the viral nonstructural genes for replication, but 
these functions are provided in trans by the replicon RNA. Similarly, the 
transcriptase functions translated from the replicon RNA transcribe the structural 
protein genes on the helper RNA, resulting in the synthesis of viral structural 
proteins and packaging of the replicon into virus-like particles. As the packaging 
or encapsidation signal for alphavirus RNAs is located within the nonstructural 
genes, the absence of these sequences in the helper RNAs precludes their 
incorporation into virus particles. 
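
The division of labor between replicon and helper RNAs described above can be illustrated with a short bookkeeping sketch in Python. It is a simplified model offered for illustration only, not a description of any procedure disclosed herein. The record type `Rna`, the function `summarize_transfection`, and the field names are hypothetical. The sketch assumes that structural proteins are made only when some co-transfected RNA supplies the replicase functions, that particles form only when capsid, E1 and E2 are all available, and that only RNAs bearing the packaging signal are packaged.

```python
from collections import namedtuple

# The three structural proteins named in the text.
STRUCTURAL = frozenset({"capsid", "E1", "E2"})

# Hypothetical record of the functions carried by one RNA molecule.
Rna = namedtuple("Rna", "name encodes_replicase structural has_packaging_signal")


def summarize_transfection(rnas):
    """Toy bookkeeping for a set of RNAs co-transfected into one cell."""
    # Replicase (nonstructural) functions act in trans on every RNA in the cell.
    replicase_present = any(r.encodes_replicase for r in rnas)

    # Structural proteins are produced only if the replicase is available.
    available = frozenset()
    if replicase_present:
        for r in rnas:
            available |= frozenset(r.structural)

    # Particles assemble only when every structural protein is supplied.
    particles_form = replicase_present and available >= STRUCTURAL

    # Only RNAs carrying the packaging signal are incorporated.
    packaged = [r.name for r in rnas if particles_form and r.has_packaging_signal]

    # A packaged RNA lacking at least one structural gene cannot spread.
    propagation_defective = all(
        not frozenset(r.structural) >= STRUCTURAL
        for r in rnas
        if r.has_packaging_signal
    )
    return {
        "particles_form": particles_form,
        "packaged": packaged,
        "propagation_defective": propagation_defective,
    }


replicon = Rna("replicon", True, frozenset(), True)
helper_1 = Rna("helper 1", False, frozenset({"E1", "E2"}), False)
helper_2 = Rna("helper 2", False, frozenset({"capsid"}), False)
print(summarize_transfection([replicon, helper_1, helper_2]))
```

In this model the helper RNAs are never packaged because they lack the packaging signal, which mirrors the reasoning given in the preceding paragraph.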

Alphavirus-permissive cells employed in the methods of the present 
invention are cells which, upon transfection with the viral RNA transcript, are 
capable of producing viral particles. Preferred alphavirus-permissive cells are 
TR339-permissive cells, Girdwood S.A.-permissive cells, S.A.AR86-permissive cells, and 
Ockelbo-permissive cells. Alphaviruses have a broad host range. Examples of suitable 
host cells include, but are not limited to, Vero cells, baby hamster kidney (BHK) cells, and 
chicken embryo fibroblast cells. 



The phrase "structural protein" as used herein refers to the encoded 
proteins which are required for encapsidation (e.g., packaging) of the RNA replicon, and 
includes the capsid protein, E1 glycoprotein, and E2 glycoprotein. As described 
hereinabove, the structural proteins of the alphavirus are distributed among one or more 
helper RNAs (i.e., a first helper RNA and a second helper RNA). In addition, one or 
more structural proteins may be located on the same RNA molecule as the replicon RNA, 
provided that at least one structural protein is deleted from the replicon RNA such that the 
resulting alphavirus particle is propagation defective. As used herein, the terms "deleted" 
or "deletion" mean either total deletion of the specified segment or the deletion of a 
sufficient portion of the specified segment to render the segment inoperative or 
nonfunctional, in accordance with standard usage. See, e.g., U.S. Patent No. 4,650,764 
to Temin et al. The term "propagation defective" as used herein, means that the replicon 
RNA cannot be encapsidated in the host cell in the absence of the helper RNA. The 
resulting alphavirus replicon particles are propagation defective inasmuch as the replicon 
RNA in these particles does not include all of the alphavirus structural proteins required 
for encapsidation, at least one of the required structural proteins being deleted therefrom, 
such that the replicon RNA initiates only an abortive infection; no new viral particles are 
produced, and there is no spread of the infection to other cells. 

The helper cell for expressing the infectious, propagation defective alphavirus 
particle comprises a set of RNAs, as described above. The set of RNAs principally 
includes a first helper RNA and a second helper RNA. The first helper RNA includes 
RNA encoding at least one alphavirus structural protein but does not encode all alphavirus 
structural proteins. In other words, the first helper RNA does not encode at least one 
alphavirus structural protein; the at least one non-coded alphavirus structural protein being 
deleted from the first helper RNA. 

In one embodiment, the first helper RNA includes RNA encoding the alphavirus 
E1 glycoprotein, with the alphavirus capsid protein and the alphavirus E2 
glycoprotein being deleted from the first helper RNA. In another embodiment, the 
first helper RNA includes RNA encoding the alphavirus E2 glycoprotein, with the 
alphavirus capsid protein and the alphavirus E1 glycoprotein being deleted from 
the first helper RNA. In a third, preferred embodiment, the first helper RNA 
includes RNA encoding the alphavirus E1 glycoprotein and the alphavirus E2 
glycoprotein, with the alphavirus capsid protein being deleted from the first helper 
RNA. 

The second helper RNA includes RNA encoding at least one 
alphavirus structural protein which is different from the at least one structural 
protein encoded by the first helper RNA. Thus, the second helper RNA encodes 
at least one alphavirus structural protein which is not encoded by the first helper 
RNA. The second helper RNA does not encode the at least one alphavirus 
structural protein which is encoded by the first helper RNA, thus the first and 
second helper RNAs do not encode duplicate structural proteins. In the 
embodiment wherein the first helper RNA includes RNA encoding only the 
alphavirus E1 glycoprotein, the second helper RNA may include RNA encoding 
one or both of the alphavirus capsid protein and the alphavirus E2 glycoprotein 
which are deleted from the first helper RNA. In the embodiment wherein the first 
helper RNA includes RNA encoding only the alphavirus E2 glycoprotein, the 
second helper RNA may include RNA encoding one or both of the alphavirus 
capsid protein and the alphavirus E1 glycoprotein which are deleted from the first 
helper RNA. In the embodiment wherein the first helper RNA includes RNA 
encoding both the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, 
the second helper RNA may include RNA encoding the alphavirus capsid protein 
which is deleted from the first helper RNA. 
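
The constraints placed on the first and second helper RNAs in the preceding paragraphs can be restated as a small consistency check. The Python sketch below is illustrative only. The function name `check_helper_set` and the protein labels are hypothetical and are not part of the disclosure. It verifies that each helper RNA encodes at least one but not all of the structural proteins and that the two helper RNAs encode no structural protein in common, and it returns whichever structural proteins are encoded by neither helper.

```python
STRUCTURAL = frozenset({"capsid", "E1", "E2"})


def check_helper_set(first, second):
    """Validate a pair of helper RNAs, each given as a set of protein labels.

    Returns the structural proteins that neither helper encodes.
    Raises ValueError if a constraint from the text is violated.
    """
    first, second = frozenset(first), frozenset(second)
    for label, helper in (("first", first), ("second", second)):
        if not helper:
            raise ValueError(f"{label} helper RNA encodes no structural protein")
        if not helper <= STRUCTURAL:
            raise ValueError(f"{label} helper RNA lists an unknown protein")
        if helper == STRUCTURAL:
            raise ValueError(f"{label} helper RNA encodes all structural proteins")
    if first & second:
        raise ValueError("helper RNAs encode duplicate structural proteins")
    return STRUCTURAL - (first | second)


# Preferred arrangement: E1 and E2 on the first helper, capsid on the second.
print(check_helper_set({"E1", "E2"}, {"capsid"}))
```

An empty result means the two helper RNAs together supply every structural protein. A non-empty result names the proteins that would have to be supplied elsewhere, for example on the replicon RNA molecule.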

In one embodiment, the packaging segment (RNA comprising the 
encapsidation or packaging signal) is deleted from at least the first helper RNA. 
In a preferred embodiment, the packaging segment is deleted from both the first 
helper RNA and the second helper RNA. 

In the preferred embodiment wherein the packaging segment is 
deleted from both the first helper RNA and the second helper RNA, the helper cell 
is co-transfected with a replicon RNA in addition to the first helper RNA and the 
second helper RNA. The replicon RNA encodes the packaging segment and an 
inserted heterologous RNA. The inserted heterologous RNA may be RNA 
encoding a protein or a peptide. In a preferred embodiment, the replicon RNA, 
the first helper RNA and the second helper RNA are provided on separate 
molecules such that a first molecule, i.e., the replicon RNA, includes RNA 
encoding the packaging segment and the inserted heterologous RNA, a second 
molecule, i.e., the first helper RNA, includes RNA encoding at least one but not 
all of the required alphavirus structural proteins, and a third molecule, i.e., the 
second helper RNA, includes RNA encoding at least one but not all of the required 
alphavirus structural proteins. For example, in one preferred embodiment of the 
present invention, the helper cell includes a set of RNAs which include (a) a 
replicon RNA including RNA encoding an alphavirus packaging sequence and an 
inserted heterologous RNA, (b) a first helper RNA including RNA encoding the 
alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, and (c) a second 
helper RNA including RNA encoding the alphavirus capsid protein so that the 
alphavirus E1 glycoprotein, the alphavirus E2 glycoprotein and the capsid protein 
assemble together into alphavirus particles in the host cell. 

In an alternate embodiment, the replicon RNA and the first helper 
RNA are on separate molecules, and the replicon RNA and RNA encoding a 
structural gene not encoded by the first helper RNA are on another single molecule 
together, such that a first molecule, i.e., the first helper RNA, includes RNA 
encoding at least one but not all of the required alphavirus structural proteins, and 
a second molecule, i.e., the replicon RNA, includes RNA encoding the packaging 
segment, the inserted heterologous RNA, and the remaining structural proteins not 
encoded by the first helper RNA. For example, in one preferred embodiment of 
the present invention, the helper cell includes a set of RNAs including (a) a 
replicon RNA including RNA encoding an alphavirus packaging sequence, an 
inserted heterologous RNA, and an alphavirus capsid protein, and (b) a first helper 
RNA including RNA encoding the alphavirus E1 glycoprotein and the alphavirus 
E2 glycoprotein so that the alphavirus E1 glycoprotein, the alphavirus E2 
glycoprotein and the capsid protein assemble together into alphavirus particles in 
the host cell, with the replicon RNA packaged therein. 

In one preferred embodiment of the present invention, the RNA 
encoding the alphavirus structural proteins, i.e., the capsid, E1 glycoprotein and 
E2 glycoprotein, contains at least one attenuating mutation, as described 
hereinabove. Thus, according to this embodiment, at least one of the first helper 
RNA and the second helper RNA includes at least one attenuating mutation. In 
a more preferred embodiment, at least one of the first helper RNA and the second 
helper RNA includes at least two, or multiple, attenuating mutations. The multiple 
attenuating mutations may be positioned in either the first helper RNA or in the 
second helper RNA, or they may be distributed randomly with one or more 
attenuating mutations being positioned in the first helper RNA and one or more 
attenuating mutations positioned in the second helper RNA. Alternatively, when 
the replicon RNA and the RNA encoding the structural proteins not encoded by 
the first helper RNA are located on the same molecule, an attenuating mutation 
may be positioned in the RNA which codes for the structural protein not encoded 
by the first helper RNA. The attenuating mutations may also be located within the 
RNA encoding non-structural proteins (e.g., the replicon RNA). 

Preferably, the first helper RNA and the second helper RNA also 
include a promoter. It is also preferred that the replicon RNA also includes a 
promoter. Suitable promoters for inclusion in the first helper RNA, second helper 
RNA and replicon RNA are well known in the art. One preferred promoter is the 
Girdwood S.A. 26S promoter for use when the alphavirus is Girdwood S.A. 
Another preferred promoter is the TR339 26S promoter for use when the 
alphavirus is TR339. Additional promoters beyond the Girdwood S.A. and TR339 
promoters include the VEE 26S promoter, the Sindbis 26S promoter, the Semliki 
Forest 26S promoter, and any other promoter sequence recognized by alphavirus 
polymerases. Alphavirus promoter sequences containing mutations which alter the 
activity level of the promoter (in relation to the activity level of the wild-type) are 
also suitable in the practice of the present invention. Such mutant promoter 
sequences are described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the 
disclosure of which is incorporated herein in its entirety. In the system wherein 
the first helper RNA, the second helper RNA, and the replicon RNA are all on 
separate molecules, the promoters, if the same promoter is used for all three 
RNAs, provide a homologous sequence between the three molecules. It is 
preferred that the selected promoter is operative with the non-structural proteins 
encoded by the replicon RNA molecule. 



In cases where vaccination with two immunogens provides improved 
protection against disease as compared to vaccination with only a single 
immunogen, a double-promoter replicon would ensure that both immunogens are 
produced in the same cell. Such a replicon would be the same as the one 
described above, except that it would contain two copies of the 26S RNA 
promoter, each followed by a different multiple cloning site, to allow for the 
insertion and expression of two different heterologous proteins. Another useful 
strategy is to insert the IRES sequence from the picornavirus, EMC virus, between 
the two heterologous genes downstream from the single 26S promoter of the 
replicon described above, thus leading to expression of two immunogens from the 
single replicon transcript in the same cell. 

C. Uses of the Present Invention. 

The alphavirus vectors, RNAs, cDNAs, helper cells, infectious virus 
particles, and methods of the present invention find use in in vitro expression 
systems, wherein the inserted heterologous RNA encodes a protein or peptide 
which is desirably produced in vitro. The RNAs, cDNAs, helper cells, infectious 
virus particles, methods, and pharmaceutical formulations of the present invention 
are additionally useful in a method of administering a protein or peptide to a 
subject in need of the protein or peptide, as a method of treatment or otherwise. 
In this embodiment of the invention, the heterologous RNA encodes the desired 
protein or peptide, and pharmaceutical formulations of the present invention are 
administered to a subject in need of the desired protein or peptide. In this manner, 
the protein or peptide may thus be produced in vivo in the subject. The subject 
may be in need of the protein or peptide because the subject has a deficiency 
thereof, or because the production of the protein or peptide in the subject may 
impart some therapeutic effect, as a method of treatment or otherwise. 

Alternately, the claimed methods provide a vaccination strategy, 
wherein the heterologous RNA encodes an immunogenic protein or peptide. 

The methods and products of the invention are also useful as 
antigens and for evoking the production of antibodies in animals such as horses 
and rabbits, from which the antibodies may be collected and then used in 
diagnostic assays in accordance with known techniques. 

A further aspect of the present invention is a method of introducing 
and expressing antisense oligonucleotides in bone marrow cell cultures to regulate 
gene expression. Alternately, the claimed method finds use in introducing and 
expressing a protein or peptide in bone marrow cell cultures. 

II. Girdwood S.A. and TR339 Clones. 

Disclosed hereinbelow are genomic RNA sequences encoding live 
Girdwood S.A. virus, live S.A.AR86 virus, and live Sindbis strain TR339 virus, 
cDNAs derived therefrom, infectious RNA transcripts encoded by the cDNAs, 
infectious viral particles containing the infectious RNA transcripts, and 
pharmaceutical formulations derived therefrom. 

The cDNA sequence of Girdwood S.A. is given herein as SEQ ID 
NO:4. Alternatively, the cDNA may have a sequence which differs from the 
cDNA of SEQ ID NO:4, but which has the same protein sequence as the cDNA 
given herein as SEQ ID NO:4. Thus, the cDNA may include one or more silent 
mutations. 



The phrase "silent mutation" as used herein refers to mutations in 
the cDNA coding sequence which do not produce mutations in the corresponding 
protein sequence translated therefrom. 
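
The definition of a silent mutation can be made concrete with a short Python sketch. It is offered for illustration only and rests on stated assumptions: the helper names `translate` and `is_silent` are hypothetical, the standard genetic code is used, and each input is a single in-frame coding sequence written as DNA, whereas the viral genome contains more than one open reading frame. The two short sequences in the usage lines are invented examples and are not taken from any SEQ ID NO.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_silent(reference, variant):
    """True if the variant differs in nucleotides but not in encoded protein."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    differs = reference.upper() != variant.upper()
    return differs and translate(reference) == translate(variant)


print(is_silent("ATGGCTTAA", "ATGGCCTAA"))  # GCT and GCC both encode alanine
print(is_silent("ATGGCTTAA", "ATGGATTAA"))  # GCT to GAT changes alanine to aspartate
```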

Likewise, the cDNA sequence of TR339 is given herein as SEQ ID 
NO:8. Alternatively, the cDNA may have a sequence which differs from the 
cDNA of SEQ ID NO:8, but which has the same protein sequence as the cDNA 
given herein as SEQ ID NO:8. Thus, the cDNA may include one or more silent 
mutations. 



The cDNAs encoding infectious Girdwood S.A. and TR339 virus 
RNA transcripts of the present invention include those homologous to, and having 
essentially the same biological properties as, the cDNA sequences disclosed herein 
as SEQ ID NO:4 and SEQ ID NO:8, respectively. Thus, cDNAs that hybridize 
to cDNAs encoding infectious Girdwood S.A. or TR339 virus RNA transcripts 
disclosed herein are also an aspect of this invention. Conditions which will permit 
other cDNAs encoding infectious Girdwood S.A. or TR339 virus transcripts to 
hybridize to the cDNAs disclosed herein can be determined in accordance with 
known techniques. For example, hybridization of such sequences may be carried 
out under conditions of reduced stringency, medium stringency, or even high 
stringency conditions (e.g., conditions represented by a wash stringency of 35-40% 
formamide with 5X Denhardt's solution, 0.5% SDS and 1X SSPE at 37°C; 
conditions represented by a wash stringency of 40-45% formamide with 5X 
Denhardt's solution, 0.5% SDS, and 1X SSPE at 42°C; and conditions represented 
by a wash stringency of 50% formamide with 5X Denhardt's solution, 0.5% SDS 
and 1X SSPE at 42°C, respectively, to cDNA encoding infectious Girdwood S.A. 
or TR339 virus RNA transcripts disclosed herein in a standard hybridization assay. 
See J. SAMBROOK ET AL., MOLECULAR CLONING: A LABORATORY 
MANUAL (2d ed. 1989)). In general, cDNA sequences encoding infectious 
Girdwood S.A. or TR339 virus RNA transcripts that hybridize to the cDNAs 
disclosed herein will be at least 30% homologous, 50% homologous, 75% 
homologous, and even 95% homologous or more with the cDNA sequences 
encoding infectious Girdwood S.A. or TR339 virus RNA transcripts disclosed 
herein. 
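
The text does not state how the percent homology figures are to be computed. One common reading, adopted here purely as an assumption, is percent identity over a pairwise alignment. The Python sketch below illustrates that reading for two sequences that have already been aligned to equal length, with `-` marking gaps. The function name `percent_identity` and the gap convention are hypothetical, and the example sequences are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same base.

    Inputs must be pre-aligned to equal length, with '-' marking gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```

Other definitions, for example ones that exclude gapped columns from the denominator, give different values for the same alignment, so any threshold comparison should state which definition is in use.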

Promoter sequences and Girdwood S.A. virus or Sindbis virus strain 
TR339 cDNA clones are operatively associated in the present invention such that 
the promoter causes the cDNA clone to be transcribed in the presence of an RNA 
polymerase which binds to the promoter. The promoter is positioned on the 5' end 
(with respect to the virion RNA sequence), of the cDNA clone. An excessive 
number of nucleotides between the promoter sequence and the cDNA clone will 
result in the inoperability of the construct. Hence, the number of nucleotides 
between the promoter sequence and the cDNA clone is preferably not more than 
eight, more preferably not more than five, still more preferably not more than 
three, and most preferably not more than one. 
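
The nested preferences for the spacing between the promoter and the cDNA clone can be tabulated as follows. This Python sketch is illustrative only. The function name `spacer_preference` and the tier labels are hypothetical shorthand for the wording of the preceding paragraph, and the text gives no numerical limit beyond which a construct becomes inoperable.

```python
# Upper limits (in nucleotides) paired with the preference wording of the text,
# ordered from the strictest tier to the loosest.
TIERS = (
    (1, "most preferred"),
    (3, "still more preferred"),
    (5, "more preferred"),
    (8, "preferred"),
)


def spacer_preference(n_nucleotides):
    """Return the strictest preference tier satisfied by a spacer length."""
    if n_nucleotides < 0:
        raise ValueError("spacer length cannot be negative")
    for limit, label in TIERS:
        if n_nucleotides <= limit:
            return label
    return "outside the preferred range"


for n in (0, 2, 5, 8, 12):
    print(n, spacer_preference(n))
```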

Examples of promoters which are useful in the cDNA sequences of 
the present invention include, but are not limited to T3 promoters, T7 promoters, 
cytomegalovirus (CMV) promoters, and SP6 promoters. The DNA sequence of 
the present invention may reside in any suitable transcription vector. The DNA 
sequence preferably has a complementary DNA sequence bound thereto so that the 
double-stranded sequence will serve as an active template for RNA polymerase. 
The transcription vector preferably comprises a plasmid. When the DNA sequence 
comprises a plasmid, it is preferred that a unique restriction site be provided 3' 
(with respect to the virion RNA sequence) to the cDNA clone. This provides a 
means for linearizing the DNA sequence to allow the transcription of genome- 
length RNA in vitro. 

The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which 
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983). 

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA. 

The Girdwood S.A. and TR339 cDNA clones and the infectious 
RNAs and infectious virus particles produced therefrom of the present invention 
are useful for the preparation of pharmaceutical formulations, such as vaccines. 
In addition, the cDNA clones, infectious RNAs, and infectious viral particles of 
the present invention are useful for administration to animals for the purpose of 
producing antibodies to the Girdwood S.A. virus or the Sindbis virus strain 
TR339, which antibodies may be collected and used in known diagnostic 
techniques for the detection of Girdwood S.A. virus or Sindbis virus strain TR339. 
Antibodies can also be generated to the viral proteins expressed from the cDNAs 
disclosed herein. As another aspect of the present invention, the claimed cDNA 
clones are useful as nucleotide probes to detect the presence of Girdwood S.A. or 
TR339 genomic RNA or transcripts. 

III. Infectious Virus Particles and Pharmaceutical Formulations. 

The infectious virus particles of the present invention include those 
containing double promoter vectors and those containing replicon vectors as 
described hereinabove. Alternately, the infectious virus particles contain infectious 
RNAs encoding the Girdwood S.A. or TR339 genome. When the infectious RNA 
comprises the Girdwood S.A. genome, preferably the RNA has the sequence 
encoded by the cDNA given as SEQ ID NO:4. When the infectious RNA 
comprises the TR339 genome, preferably the RNA has the sequence encoded by 
the cDNA given as SEQ ID NO:8. 

The infectious alphavirus particles of the present invention may be 
prepared according to the methods disclosed herein in combination with techniques 
known to those skilled in the art. These methods include transfecting an 
alphavirus-permissive cell with a replicon RNA including the alphavirus packaging 
segment and an inserted heterologous RNA, a first helper RNA including RNA 
encoding at least one alphavirus structural protein, and a second helper RNA 
including RNA encoding at least one alphavirus structural protein which is 
different from that encoded by the first helper RNA. Alternately, and preferably, 
at least one of the helper RNAs is produced from a cDNA encoding the helper 
RNA and operably associated with an appropriate promoter, the cDNA being 
stably transfected and integrated into the cells. More preferably, all of the helper 
RNAs will be "launched" from stably transfected cDNAs. The step of transfecting 
the alphavirus-permissive cell can be carried out according to any suitable means 
known to those skilled in the art, as described above with respect to propagation- 
competent viruses. 



Uptake of propagation-competent RNA into the cells in vitro can be 
carried out according to any suitable means known to those skilled in the art. 
Uptake of RNA into the cells can be achieved, for example, by treating the cells 
with DEAE-dextran, treating the RNA with LIPOFECTIN® before addition to the 
cells, or by electroporation, with electroporation being the currently preferred 
means. These techniques are well known in the art. See e.g., United States 
Patent No. 5,185,440 to Davis et al., and PCT Publication No. WO 92/10578 to 
Bioption AB, the disclosures of which are incorporated herein by reference in their 
entirety. Uptake of propagation-competent RNA into the cell in vivo can be 
carried out by administering the infectious RNA to a subject as described in 
Section I above. 

The infectious RNAs may also contain a heterologous RNA 
segment, where the heterologous RNA segment contains a heterologous RNA and 
a promoter operably associated therewith. It is preferred that the infectious RNA 
introduces and expresses the heterologous RNA in bone marrow cells as described 
in Section I above. According to this embodiment, it is preferable that the 
promoter operatively associated with the heterologous RNA is operable in bone 
marrow cells. The heterologous RNA may encode any protein or peptide, 
preferably an immunogenic protein or peptide, a therapeutic protein or peptide, a 
hormone, a growth factor, an interleukin, a cytokine, a chemokine, an enzyme, 
a ribozyme, or an antisense oligonucleotide as described in more detail in Section 
I above. 



The step of facilitating the production of the infectious viral particles 
in the cells may be carried out using conventional techniques. See e.g., United 
States Patent No. 5,185,440 to Davis et al., PCT Publication No. WO 92/10578 
to Bioption AB, and United States Patent No. 4,650,764 to Temin et al. (although 
Temin et al. relates to retroviruses rather than alphaviruses). The infectious viral 
particles may be produced by standard cell culture growth techniques. 

The step of collecting the infectious virus particles may also be 
carried out using conventional techniques. For example, the infectious particles 
may be collected by cell lysis, or collection of the supernatant of the cell culture, 
as is known in the art. See e.g., United States Patent No. 5,185,440 to Davis et 
al., PCT Publication No. WO 92/10578 to Bioption AB, and United States Patent 
No. 4,650,764 to Temin et al. Other suitable techniques will be known to those 
skilled in the art. Optionally, the collected infectious virus particles may be 
purified if desired. Suitable purification techniques are well known to those skilled 
in the art. 



Pharmaceutical formulations, such as vaccines, of the present 
invention comprise an immunogenic amount of the infectious virus particles in 
combination with a pharmaceutically acceptable carrier. An "immunogenic 
amount" is an amount of the infectious virus particles which is sufficient to evoke 
an immune response in the subject to which the pharmaceutical formulation is 
administered. An amount of from about 10^3 to about 10^7 particles, and preferably 
about 10^4 to 10^6 particles per dose is believed suitable, depending upon the age and 
species of the subject being treated, and the immunogen against which the immune 
response is desired. 

Pharmaceutical formulations of the present invention for therapeutic 
use comprise a therapeutic amount of the infectious virus particles in combination 
with a pharmaceutically acceptable carrier. A "therapeutic amount" is an amount 
of the infectious virus particles which is sufficient to produce a therapeutic effect 
(e.g., triggering an immune response or supplying a protein to a subject in need 
thereof) in the subject to which the pharmaceutical formulation is administered. 
The therapeutic amount will depend upon the age and species of the subject being 
treated, and the therapeutic protein or peptide being administered. Typical dosages 
are an amount from about 10^1 to about 10^5 infectious units. 

Exemplary pharmaceutically acceptable carriers include, but are not 
limited to, sterile pyrogen-free water and sterile pyrogen-free physiological saline 
solution. Subjects which may be administered immunogenic amounts of the 
infectious virus particles of the present invention include but are not limited to 
human and animal (e.g., pig, cattle, dog, horse, donkey, mouse, hamster,
monkey) subjects.

Pharmaceutical formulations of the present invention include those 
suitable for parenteral (e.g., subcutaneous, intracerebral, intradermal, 
intramuscular, intravenous and intraarticular) administration. Alternatively, 
pharmaceutical formulations of the present invention may be suitable for 
administration to the mucous membranes of a subject (e.g., intranasal administration
by use of a dropper, swab, or inhaler). The formulations may be conveniently
prepared in unit dosage form and may be prepared by any of the methods well 
known in the art. 

The following examples are provided to illustrate the present
invention, and should not be construed as limiting thereof. In these examples,
PBS means phosphate buffered saline, EDTA means ethylene diamine tetraacetate,
ml means milliliter, µl means microliter, mM means millimolar, µM means
micromolar, u means unit, PFU means plaque forming units, g means gram, mg
means milligram, µg means microgram, cpm means counts per minute, ic means
intracerebral or intracerebrally, ip means intraperitoneal or intraperitoneally, iv
means intravenous or intravenously, and sc means subcutaneous or subcutaneously.

Amino acid sequences disclosed herein are presented in the amino 
to carboxyl direction, from left to right. The amino and carboxyl groups are not
presented in the sequence. Nucleotide sequences are presented herein by single 
strand only in the 5' to 3' direction, from left to right. Nucleotides and amino 
acids are represented herein in the manner recommended by the IUPAC-IUB 
Biochemical Nomenclature Commission, or (for amino acids) by either one letter 
or three letter code, in accordance with 37 CFR § 1.822 and established usage. 
Where one letter amino acid code is used, the same sequence is also presented 
elsewhere in three letter code. 



EXAMPLE 1
Cells and Virus Stocks
S.A.AR86 was isolated in 1954 from a pool of Culex sp. mosquitoes
collected near Johannesburg, South Africa. Weinbren et al., S. Afr. Med. J. 30,
631-36 (1956). Ockelbo82 was isolated from Culiseta sp. mosquitoes collected in
Edsbyn, Sweden in 1982 and was associated serologically with human disease.
Niklasson et al., Am. J. Trop. Med. Hyg. 33, 1212-17 (1984). Girdwood S.A.
was isolated from a human patient in the Johannesburg area of South Africa in
1963. Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963). Molecularly cloned
virus TR339 represents the deduced consensus sequence of Sindbis AR339.
McKnight et al., J. Virol. 70, 1981-89 (1996); William Klimstra, personal
communication. TRSB is a laboratory strain of Sindbis isolate AR339 derived
from a cDNA clone pTRSB and differing from the AR339 consensus sequence at
three codons. McKnight et al., J. Virol. 70, 1981-89 (1996). pTR5000 is a full-
length cDNA clone of Sindbis AR339 following the SP6 phage promoter and
containing mostly Sindbis AR339 sequences.

Stocks of all molecularly cloned viruses were prepared by
electroporating genome length in vitro transcripts of their respective cDNA clones
into BHK-21 cells. Heidner et al., J. Virol. 68, 2683-92 (1994). Girdwood S.A.
(Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963)) and Ockelbo82 (Espmark and
Niklasson, Am. J. Trop. Med. Hyg. 33, 1203-11 (1984); Niklasson et al., Am. J.
Trop. Med. Hyg. 33, 1212-17 (1984)) were passed one to three times in BHK-21
cells in order to produce amplified stocks of virus. All virus stocks were
stored at -70°C until needed. The titers of the virus stocks were determined on
BHK-21 cells from aliquots of frozen virus.

EXAMPLE 2
Cloning the S.A.AR86 and Girdwood S.A. Genomic Sequences
The sequences of S.A.AR86 (Figure 1, SEQ ID NO:1) and
Girdwood S.A. (Figure 3, SEQ ID NO:4) were determined from uncloned reverse
transcriptase-polymerase chain reaction (RT-PCR) fragments amplified from virion
RNA. Heidner et al., J. Virol. 68, 2683-92 (1994). The sequence of the 5' 40
nucleotides was determined by directly sequencing the genomic RNA. Sanger et
al., Proc. Natl. Acad. Sci. USA 74, 5463-67 (1977); Zimmern and Kaesberg,
Proc. Natl. Acad. Sci. USA 75, 4257-61 (1978); Ahlquist et al., Cell 23, 183-89
(1981).

The S.A.AR86 genome was 11,663 nucleotides in length, excluding
the 5' CAP and 3' poly(A) tail, 40 nucleotides shorter than the alphavirus prototype
Sindbis strain AR339. Strauss et al., Virology 133, 92-110 (1984). Compared
with the consensus sequence of Sindbis virus AR339 (McKnight et al., J. Virol.
70, 1981-89 (1996)), S.A.AR86 contained two separate 6-nucleotide insertions, and
one 3-nucleotide insertion in the 3' half of the nsP3 gene, a region not well
conserved among alphaviruses. The two 6-nucleotide insertions were found
immediately 3' of nucleotides 5403 and 5450, and the 3-nucleotide insertion was
immediately 3' of nucleotide 5546 compared with the AR339 genome. In addition,
S.A.AR86 contained a 54-nucleotide deletion in nsP3 which spanned nucleotides
5256 to 5311 of AR339. As a result of these deletions and insertions, S.A.AR86
nsP3 was 13 amino acids smaller than AR339, containing an 18-amino acid
deletion and a total of 5 amino acids inserted. The 3' untranslated region of
S.A.AR86 contained, with respect to AR339, two 1-nucleotide deletions at
nucleotides 11,513 and 11,602, and one 1-nucleotide insertion following nucleotide
11,664. The total numbers of nucleotides and predicted amino acids comprising
the remaining genes of S.A.AR86 were identical to those of AR339.
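
The length bookkeeping in the preceding paragraph can be checked with simple
arithmetic. The following Python sketch is illustrative only: the constant and
function names are hypothetical, and the only inputs are the insertion and
deletion sizes and the genome length stated above.

```python
# Illustrative tally of the S.A.AR86 insertions and deletions relative to
# Sindbis AR339. All sizes come from the text; the names are hypothetical.

NSP3_INSERTIONS_NT = [6, 6, 3]  # two 6-nt insertions and one 3-nt insertion in nsP3
NSP3_DELETION_NT = 54           # single large deletion in nsP3
UTR3_DELETIONS_NT = [1, 1]      # two 1-nt deletions in the 3' untranslated region
UTR3_INSERTIONS_NT = [1]        # one 1-nt insertion in the 3' untranslated region
SAAR86_LENGTH_NT = 11663        # excluding the 5' cap and 3' poly(A) tail


def net_nucleotide_change() -> int:
    """Net length change of S.A.AR86 relative to AR339, in nucleotides."""
    gained = sum(NSP3_INSERTIONS_NT) + sum(UTR3_INSERTIONS_NT)
    lost = NSP3_DELETION_NT + sum(UTR3_DELETIONS_NT)
    return gained - lost


def nsp3_amino_acid_change() -> int:
    """Net change in nsP3 length, in amino acids (3 nucleotides per codon)."""
    return (sum(NSP3_INSERTIONS_NT) - NSP3_DELETION_NT) // 3


def implied_ar339_length() -> int:
    """AR339 genome length implied by the S.A.AR86 length and the net change."""
    return SAAR86_LENGTH_NT - net_nucleotide_change()


if __name__ == "__main__":
    print("net nucleotide change:", net_nucleotide_change())
    print("nsP3 amino acid change:", nsp3_amino_acid_change())
    print("implied AR339 length:", implied_ar339_length())
```

The tally agrees with the stated figures: a net loss of 40 nucleotides overall
and of 13 amino acids in nsP3 (18 deleted, 5 inserted).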

A notable feature of the deduced amino acid sequence of S.A.AR86
(Figure 2, SEQ ID NO:2 and SEQ ID NO:3) was the cysteine codon in place of
an opal termination codon between nsP3 and nsP4. S.A.AR86 is the only
alphavirus of the Sindbis group, and one of just three alphavirus isolates sequenced
to date, which do not contain an opal termination codon between nsP3 and nsP4.
Takkinen, K., Nucleic Acids Res. 14, 5667-5682 (1986); Strauss et al., Virology
164, 265-74 (1988).

The genome of Girdwood S.A. was 11,717 nucleotides long
excluding the 5' CAP and 3' poly(A) tail. The nucleotide sequence (SEQ ID
NO:4) of the Girdwood S.A. genome and the putative amino acid sequence (SEQ
ID NO:5 and SEQ ID NO:6) of the Girdwood S.A. gene products are shown in
Figure 3 and Figure 4, respectively. The asterisk at position 1902 in SEQ ID
NO:5 indicates the position of the opal termination codon in the coding region of
the nonstructural polyprotein. The extra nucleotides relative to AR339 were in the
nonconserved half of nsP3, which contained insertions totalling 15 nucleotides, and
in the 3' untranslated region, which contained two 1-nucleotide deletions and a 1-
nucleotide insertion with respect to the consensus Sindbis AR339 genome. The
insertions found in the nsP3 gene of Girdwood S.A. were identical in position and
content to those found in S.A.AR86, although Girdwood S.A. did not have the
large nsP3 deletion characteristic of S.A.AR86. The remaining portions of the
genome contained the same number of nucleotides and predicted amino acids as
Sindbis AR339.
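
The Girdwood S.A. figures can be cross-checked in the same spirit. The Python
sketch below is illustrative only and rests on two labeled assumptions: an
AR339 genome length of 11,703 nucleotides (implied by S.A.AR86 being 11,663
nucleotides long and 40 nucleotides shorter than AR339), and all 15 inserted
nsP3 nucleotides lying upstream of the opal codon. The opal position 1897 is
the value reported for the consensus AR339 clone in Example 6. The names are
hypothetical.

```python
# Illustrative cross-check of the Girdwood S.A. genome length and opal codon
# position against the consensus Sindbis AR339 values.

AR339_LENGTH_NT = 11703        # assumption: implied by 11,663 + 40 (see lead-in)
NSP3_INSERTED_NT = 15          # insertions in the nonconserved half of nsP3
UTR3_DELETED_NT = 2            # two 1-nt deletions in the 3' untranslated region
UTR3_INSERTED_NT = 1           # one 1-nt insertion in the 3' untranslated region
AR339_OPAL_POSITION = 1897     # opal position in the AR339 nonstructural polyprotein


def girdwood_length() -> int:
    """Expected Girdwood S.A. genome length in nucleotides."""
    return AR339_LENGTH_NT + NSP3_INSERTED_NT - UTR3_DELETED_NT + UTR3_INSERTED_NT


def girdwood_opal_position() -> int:
    """Expected opal position, assuming the insertions precede the opal codon."""
    return AR339_OPAL_POSITION + NSP3_INSERTED_NT // 3


if __name__ == "__main__":
    print("expected Girdwood S.A. length:", girdwood_length())
    print("expected opal position:", girdwood_opal_position())
```

Both values match the text: 11,717 nucleotides and an opal codon at position
1902 of the nonstructural polyprotein.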

Overall, Girdwood S.A. was 94.5% identical to the consensus
Sindbis AR339 sequence, differing at 655 nucleotides not including the insertions
and deletions. These nucleotide differences resulted in 88 predicted amino acid
changes or a difference of 2.3%. A plurality of amino acid differences were
concentrated in the nsP3 gene, which contained 32 of the amino acid changes, 25
of which were in the nonconserved 3' half.

The Girdwood S.A. nucleotides at positions 1, 3, and 11,717 could
not be resolved. Because the primer used during the RT-PCR amplification of the 
3' end of the genome assumed a cytosine in the 3' terminal position, the identity 
of this nucleotide could not be determined with certainty. However, in all 
alphaviruses sequenced to date there is a cytosine in this position. This, combined 
with the fact that no difficulty was encountered in obtaining RT-PCR product for 
this region with an oligo(dT) primer ending with a 3'G, suggested that Girdwood 
S.A. also contains a cytosine at this position. The ambiguity at nucleotide 
positions 1 and 3 resulted from strong stops encountered during the RNA 
sequencing. 



EXAMPLE 3
Comparison of S.A.AR86 and Girdwood S.A.
Sequences With Other Sindbis-Related Virus Sequences

Table 1 examines the relationship of S.A.AR86 and Girdwood S.A.
to each other and to other Sindbis-related viruses. This was accomplished by
aligning the nucleotide and deduced amino acid sequences of Ockelbo82, AR339
and Girdwood S.A. to those of S.A.AR86 and then calculating the percentage
identity for each gene using the programs contained within the Wisconsin GCG
package (Genetics Computer Group, 575 Science Drive, Madison WI 53711), as
described in more detail in McKnight et al., J. Virol. 70, 1981-89 (1996).
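
As a rough illustration of what a per-gene percentage identity means, the
following Python sketch counts matching positions in a pair of sequences that
have already been aligned. It is not the GCG programs used in this Example; the
function name, the gap character, and the toy sequences are hypothetical and
chosen only to show the calculation. Columns containing a gap in either
sequence are skipped, in keeping with reporting differences "not including the
insertions and deletions."

```python
# Minimal percent-identity calculation over two pre-aligned sequences.
# Hypothetical helper for illustration; not the Wisconsin GCG implementation.

GAP = "-"  # assumed gap character in the aligned input


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over alignment columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == GAP or base_b == GAP:
            continue  # skip insertions/deletions
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences, not taken from any of the viruses discussed here.
    print(percent_identity("ACGU-ACGU", "ACGUAACGA"))
```

Applied gene by gene to each pair of aligned viral sequences, a calculation of
this kind yields the sort of pairwise identity values summarized in Table 1.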

The analysis suggests that S.A.AR86 is most similar to the other
South African isolate, Girdwood S.A., and that the South African isolates are more
similar to the Swedish Ockelbo82 isolate than to the Egyptian Sindbis AR339
isolate. These results also suggest that it is unlikely that S.A.AR86 is a
recombinant virus like WEE virus. Hahn et al., Proc. Natl. Acad. Sci. USA 85,
5997-6001 (1988).

EXAMPLE 4
Neurovirulence of S.A.AR86 and Girdwood S.A.
Girdwood S.A., Ockelbo82, and S.A.AR86 are related by sequence;
in contrast, it has previously been reported that only S.A.AR86 displayed the adult
mouse neurovirulence phenotype. Russell et al., J. Virol. 63, 1619-29 (1989).
These findings were confirmed by the present investigations. Briefly, groups of
four female CD-1 mice (3-6 weeks of age) were inoculated ic with 10^3 plaque-
forming units (PFU) of S.A.AR86, Girdwood S.A., or Ockelbo82. Neither
Girdwood S.A. nor Ockelbo82 infection produced any clinical signs of infection.
Infection with S.A.AR86 produced neurological signs within four to five days and
ultimately killed 100% of the mice as previously demonstrated.



Table 2 lists those amino acids of S.A.AR86 which might explain
the neurovirulence phenotype in adult mice. A position was scored as potentially
related to the S.A.AR86 adult neurovirulence phenotype if the S.A.AR86 amino
acid differed from that which otherwise was absolutely conserved at that position
in the other viruses.



TABLE 2

Divergent Amino Acids in S.A.AR86
Potentially Related to the Adult Neurovirulence Phenotype

            Position in    S.A.AR86       Conserved
            S.A.AR86       Amino Acid     Amino Acid

nsP1        583            Thr            Ile
nsP2        256            Arg            Ala
            648            Ile            Val
            651            Lys            Glu
nsP3        344            Gly            Glu
            386            Tyr            Ser
            441            Asp            Gly
            445            Ile            Met
            537            Cys            Opal
E2          243            Ser            Leu
6K          30             Val            Ile
E1          112            Val            Ala
            169            Leu            Ser

EXAMPLE 5
pS55 Molecular Clone of S.A.AR86
As a first step in investigating the unique adult mouse
neurovirulence phenotype of S.A.AR86, a full-length cDNA clone of the
S.A.AR86 genome was constructed. The sources of cDNA included conventional
cDNA clones (Davis et al., Virology 171, 189-204 (1989)) as well as uncloned
RT-PCR fragments derived from the S.A.AR86 genome. As described previously,
these were substituted, starting at the 3' end, into pTR5000 (McKnight et al., J.
Virol. 70, 1981-89 (1996)), a full-length Sindbis clone from which infectious
genomic replicas could be derived by transcription with SP6 polymerase in vitro.

The end result was pS55, a molecular clone of S.A.AR86 from
which infectious transcripts could be produced and which contained four nucleotide
changes (G for A at nt 215; G for C at nt 3863; G for A at nt 5984; and C for T
at nt 9113) but no amino acid coding differences with respect to the S.A.AR86
genomic RNA (amino acid sequence of S.A.AR86 presented in Figure 2 (SEQ ID
NO:2 and SEQ ID NO:3)). The nucleotide sequence of clone pS55 is presented
in Figure 5 (SEQ ID NO:7).

As has been described by Simpson et al., Virology 222, 464-69
(1996), neurovirulence and replication of the virus derived from pS55 (S55) were
compared with those of S.A.AR86. It was found that S55 exhibits the distinctive
adult neurovirulence characteristic of S.A.AR86. Like S.A.AR86, S55 produces
100% mortality in adult mice infected with the virus, and the survival times of
animals infected with both viruses were indistinguishable. In addition, S55 and
S.A.AR86 were found to replicate to essentially equivalent titers in vivo, and the
profiles of S55 and S.A.AR86 virus growth in the central nervous system and
periphery were very similar.

From these data it was concluded that the silent changes found in
virus derived from clone pS55 had little or no effect on its growth or virulence,
and that this molecularly cloned virus accurately represents the biological isolate,
S.A.AR86.

EXAMPLE 6
Construction of the Consensus AR339 Virus TR339
The consensus sequence of the Sindbis virus AR339 isolate, the
prototype alphavirus, was deduced. The consensus AR339 sequence was inferred
by comparison of the TRSB sequence (a laboratory-derived AR339 strain) with the
complete or partial sequences of HRSP (the GenBank sequence; Strauss et al.,
Virology 133, 92-110 (1984)), SV1A and NSV (AR339-derived laboratory strains;
Lustig et al., J. Virol. 62, 2329-36 (1988)), and SIN (a laboratory-derived AR339
strain; Davis et al., Virology 161, 101-108 (1987), Strauss et al., J. Virol. 65,
4654-64 (1991)). Each of these viruses was descended from AR339. Where these
sequences differed from each other, they also were compared with the amino acid
sequences of other viruses related to Sindbis virus: Ockelbo82, S.A.AR86,
Girdwood S.A., and the somewhat more distantly related Aura virus. Rumenapf
et al., Virology 208, 621-33 (1995).



The details of determining a consensus AR339 sequence and
constructing the consensus virus TR339 have been described elsewhere. McKnight
et al., J. Virol. 70, 1981-89 (1996); Klimstra et al., manuscript in preparation.
The nucleotide sequence (SEQ ID NO:8) of pTR339 is presented in Figure 6.
The deduced amino acid sequences of the pTR339 non-structural and structural
polyproteins are shown as SEQ ID NO:9 and SEQ ID NO:10, respectively. The
asterisk at position 1897 in SEQ ID NO:9 indicates the position of the opal
termination codon in the coding region of the nonstructural polyprotein. The
consensus nucleotide sequence diverged from the pTRSB sequence at three coding
positions (nsP3 528, E2 1, and E1 72). These differences are illustrated in Table
3.
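
The codon assignments reported in Table 3 can be sanity-checked against the
standard genetic code. The Python sketch below is illustrative only: the
dictionary holds just the six codons involved rather than a full codon table,
the Ser codon at E2 1 is read as AGC, and all names are hypothetical.

```python
# Check that each TR339/TRSB codon pair from Table 3 encodes the stated amino
# acids and differs by a single nucleotide. Partial codon table only.

STANDARD_CODE_SUBSET = {
    "CGA": "Arg", "CAA": "Gln",
    "AGC": "Ser", "AGA": "Arg",
    "GCU": "Ala", "GUU": "Val",
}

# position -> (TR339 codon, TRSB codon)
TABLE_3_CODONS = {
    "nsP3 528": ("CGA", "CAA"),
    "E2 1": ("AGC", "AGA"),   # AGC is the assumed reading of the Ser codon
    "E1 72": ("GCU", "GUU"),
}


def nucleotide_differences(codon_a: str, codon_b: str) -> int:
    """Count positions at which two codons differ."""
    return sum(1 for a, b in zip(codon_a, codon_b) if a != b)


def summarize() -> dict:
    """Map each position to (TR339 residue, TRSB residue, differing nucleotides)."""
    return {
        position: (
            STANDARD_CODE_SUBSET[tr339],
            STANDARD_CODE_SUBSET[trsb],
            nucleotide_differences(tr339, trsb),
        )
        for position, (tr339, trsb) in TABLE_3_CODONS.items()
    }


if __name__ == "__main__":
    for position, row in summarize().items():
        print(position, row)
```

Under these readings, each of the three coding differences corresponds to a
single nucleotide substitution within the codon.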



TABLE 3

Amino Acid Differences Between
Laboratory Strain TRSB and Molecular Clone TR339

            nsP3 528         E2 1             E1 72
            (nt 5683)        (nt 8633)        (nt 10279)

TR339       Arg (CGA)        Ser (AGC)        Ala (GCU)
TRSB        Gln (CAA)        Arg (AGA)        Val (GUU)

EXAMPLE 7
Animals Used for In Vivo Localization Studies 
Specific pathogen free CD-1 mice were obtained from Charles River
Breeding Laboratories (Raleigh, North Carolina) at 21 days of age and maintained
under barrier conditions until approximately 37 days of age. Intracerebral (ic)
inoculations were performed as previously described, Simpson et al., Virol. 222,
464-69 (1996), with 500 PFU of S51 (an attenuated mutant of S55) or 10^3 PFU of
S55. Animals inoculated peripherally were first anesthetized with METOFANE®.
Then, 25 µl of diluent (PBS, pH 7.2, 1% donor calf serum, 100 u/ml penicillin,
50 µg/ml streptomycin, 0.9 mM CaCl2, and 0.5 mM MgCl2) containing 10^3 PFU
of virus were injected either intravenously (iv) into the tail vein, subcutaneously
(sc) into the skin above the shoulder blades on the middle of the back, or
intraperitoneally (ip) in the lower right abdomen. Animals were sacrificed at
various times post-inoculation as previously described. Simpson et al., Virol. 222,
464-69 (1996). Brains (including brainstems) were homogenized in diluent to 30%
w/v, and right quadriceps were homogenized in diluent to 25% w/v. Homogenates
were handled and titered as described previously. Simpson et al., Virol. 222, 464-
69 (1996). Bone marrow was harvested by crushing both femurs from each animal
in sufficient diluent to produce a 30% w/v suspension (calculated as weight of
uncrushed femurs in volume of diluent). Samples were stored at -70°C. For
titration, samples were thawed and clarified by centrifugation at 1,000 x g for 20
minutes at 4°C before being titered by conventional plaque assay on BHK-21 cells.
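
The weight/volume figures above translate into a diluent volume per sample as
sketched below. This Python example is illustrative only: it follows the
convention stated for the femurs (tissue weight relative to volume of diluent,
so that 30% w/v means 30 g per 100 ml), and the function name and the tissue
weights in the usage example are hypothetical, not measurements from this
study.

```python
# Diluent volume needed to bring a weighed tissue sample to a target % w/v,
# computed as tissue weight relative to volume of diluent.


def diluent_volume_ml(tissue_weight_g: float, percent_w_v: float) -> float:
    """Return the diluent volume (ml) giving the requested % w/v suspension."""
    if tissue_weight_g <= 0:
        raise ValueError("tissue weight must be positive")
    if not 0 < percent_w_v <= 100:
        raise ValueError("percent w/v must be in the range (0, 100]")
    # % w/v = grams per 100 ml, so volume = weight * 100 / percent
    return tissue_weight_g * 100.0 / percent_w_v


if __name__ == "__main__":
    # Hypothetical sample weights, for illustration only.
    print("brain, 30% w/v:", diluent_volume_ml(0.45, 30.0), "ml")
    print("quadriceps, 25% w/v:", diluent_volume_ml(0.20, 25.0), "ml")
```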

EXAMPLE 8 
Tissue Preparation for In Situ Hybridization Studies 

Animals were anesthetized by ip injection of 0.5 ml AVERTIN® at
various times post-inoculation followed by perfusion with 60 to 75 ml of 4%
paraformaldehyde in PBS (pH 7.2) at a flow rate of 10 ml per minute. The entire
carcass was decalcified for 8 to 10 weeks in 4% paraformaldehyde containing 8%
EDTA in PBS (pH 6.8) at 4°C. This solution was changed twice during the
decalcification period. Selected tissues were cut into blocks approximately 3 mm
thick and placed into biopsy cassettes for paraffin embedding and sectioning.
Blocks were embedded, sectioned and hematoxylin/eosin stained by Experimental
Pathology Laboratories (Research Triangle Park, North Carolina) or North
Carolina State University Veterinary School Pathology Laboratory (Raleigh, North
Carolina).

EXAMPLE 9 
In Situ Hybridization
Hybridizations were performed using a [35S]-UTP labeled S.A.AR86
specific riboprobe derived from pDS-45. Clone pDS-45 was constructed by first
amplifying a 707 base pair fragment from pS55 by PCR using primers 7241 (5'-
CTGCGGCGGATTCATCTTGC-3', SEQ ID NO:11) and SC-3 (5'-
CTCCAACTTAAGTG-3', SEQ ID NO:12). The resulting 707 base pair fragment
was purified using a GENE CLEAN® kit (Bio101, CA), digested with HhaI, and
cloned into the SmaI site of pSP72 (Promega). Linearizing pDS-45 with EcoRV
and performing an in vitro transcription reaction with SP6 DNA-dependent RNA
polymerase (Promega) in the presence of [35S]-UTP resulted in a riboprobe
approximately 500 nucleotides in length, of which 445 nucleotides were
complementary to the S.A.AR86 genome (nucleotides 7371 through 7816). A
riboprobe specific for the influenza strain PR-8 hemagglutinin (HA) gene was used
as a control probe to test non-specific binding. The in situ hybridizations were
performed as described previously (Charles et al., Virol. 208, 662-71 (1995)) using
10^5 cpm of probe per slide.
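
For readers who wish to inspect the two primers, the following Python sketch
reports their length, GC content, and reverse complement. It is illustrative
only: the helper names are hypothetical, and the SC-3 sequence is taken as
contiguous (any internal space in the printed sequence is assumed to be a
typesetting artifact).

```python
# Basic properties of the PCR primers used to construct pDS-45.

PRIMERS = {
    "7241": "CTGCGGCGGATTCATCTTGC",  # SEQ ID NO:11
    "SC-3": "CTCCAACTTAAGTG",        # SEQ ID NO:12, read as contiguous
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Return the percentage of G and C residues in the sequence."""
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt,", round(gc_percent(seq), 1), "% GC,",
              "reverse complement:", reverse_complement(seq))
```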



EXAMPLE 10 
Replication of S.A.AR86 in Bone Marrow 
Three groups of six adult mice each were inoculated peripherally
(sc, ip, or iv) with 1200 PFU of S55 (a molecular clone of S.A.AR86) in 25 µl
of diluent. Under these conditions, the infection produced no morbidity or
mortality. Two mice from each group were anesthetized and sacrificed at 2, 4 and
6 days post-inoculation by exsanguination. The serum, brain (including
brainstem), right quadriceps, and both femurs were harvested and titered by plaque
assay. Virus was never detected in the quadriceps samples of animals inoculated
sc (Table 4). A single animal inoculated ip (two days post-inoculation) and two
mice inoculated iv (at four and six days post-inoculation) had detectable virus in
the right quadriceps, but the titer was at or just above the limit of detection (6.25
PFU/g tissue). Virus was present sporadically or at low levels in the brain and
serum of animals regardless of the route of inoculation. Virus was detected in the
bone marrow of animals regardless of the route of inoculation. However, the
presence of virus in bone marrow of animals inoculated sc or ip was more sporadic
than in animals inoculated iv, where five out of six animals had detectable virus.
These results suggest that S55 targets to the bone marrow, especially following iv
inoculation.

The level and frequency of virus detected in the serum and muscle
suggested that virus detected in the bone marrow was not residual virus
contamination from blood or connective tissue remaining in bone marrow samples.
The following experiment also suggested that virus in bone marrow was not due
to tissue or serum contamination. Mice were inoculated ic with 1200 PFU of S55
in 25 µl of diluent. Animals were sacrificed at 0.25, 0.5, 1, 1.5, 2, 3, 4, 5, and
6 days post-inoculation, and the carcasses were decalcified as described in
Example 8. Coronal sections taken at approximately 3 mm intervals through the
head, spine (including shoulder area), and hips were probed with an S55-specific
[35S]-UTP labeled riboprobe derived from pDS-45. Positive in situ hybridization
signal was detected by one day post-inoculation in the bone marrow of the skull
(data not shown). Weak signal also was present in some of the chondrocytes of
the vertebrae, suggesting that S55 was replicating in these cells as well. Although
the frequency of positive bone marrow cells was low, the signal was very intense
over individual positive cells. This result strongly suggests that S55 replicates in
vivo in a subset of cells contained in the bone marrow.

EXAMPLE 11 
Other Sindbis Group Viruses 
It was of interest to determine if the ability to replicate in the bone
marrow of mice was unique to S55 or was a general feature of other viruses, both
Sindbis and non-Sindbis viruses, in the Sindbis group. Six 38-day-old female CD-1
mice were inoculated iv with 25 µl of diluent containing 10^3 PFU of S55,
Ockelbo82, Girdwood S.A., TR339, or TRSB. At 2, 4 and 6 days post-
inoculation two mice from each group were sacrificed and whole blood, serum,
brain (including brainstem), right quadriceps, and both femurs were harvested for
virus titration.

The results of this experiment were similar to those with S55.
TRSB infected animals had no virus detectable in serum or whole blood in any
animal at any time, and with the other viruses tested, no virus was detected in the
serum or whole blood of any animal beyond two days post-inoculation (detection
limit, 25 PFU/ml). Neither TRSB nor TR339 was detectable in the brains of
infected animals at any time post-inoculation. S55, Girdwood S.A., and
Ockelbo82 were present in the brains of infected animals sporadically with the
titers being at or near the 75 PFU/g level of detection. All the tested viruses were
found sporadically at or slightly above the 50 PFU/g detection limit in the right
quadriceps of infected animals except for a single animal four days post-inoculation
with TRSB which had nearly 10^5 PFU/g of virus in its quadriceps.

The frequency at which the different viruses were detected in bone
marrow varied widely, with S55 and Girdwood S.A. being the most frequently
isolated (five out of six animals) and Ockelbo82 and TRSB being the least
frequently isolated from bone marrow (one out of six animals and two out of six
animals, respectively) (Table 4). Girdwood S.A. and S55 gave nearly identical
profiles in all tissues. Girdwood S.A., unlike S.A.AR86, is not neurovirulent in
adult mice (Example 4), suggesting that the adult neurovirulence phenotype is
distinct from the ability of the virus to replicate efficiently in bone marrow.

EXAMPLE 12
Virus Persistence in Bone Marrow
The next step in our investigations was to evaluate the possibility
that S.A.AR86 persisted long-term in bone marrow. S51 is a molecularly cloned,
attenuated mutant of S55. S51 differs from S55 by a threonine for isoleucine
substitution at amino acid residue 538 of nsP1 and is attenuated in adult mice
inoculated intracerebrally. Like S55, S51 targeted to and replicated in the bone
marrow of 37-day-old female CD-1 mice following ic inoculation. Mice were
inoculated ic with 500 PFU of S51 and sacrificed at 4, 8, 16, and 30 days post-
inoculation for determination of bone marrow and serum titers. At no time post-
inoculation was virus detected in the serum above the 6.25 PFU/ml detection limit.
Virus was detectable in the bone marrow samples of both animals sampled at four
days post-inoculation and in one animal eight days post-inoculation (Table 5). No
virus was detectable by titration on BHK-21 cells in any of the bone marrow
samples beyond eight days post-inoculation. These results suggested that the
attenuating mutation present in S51, which reduces the neurovirulence of the virus,
did not impair acute viral replication in the bone marrow.

It was notable that the plaque size on BHK-21 cells of virus
recovered on day 4 post-inoculation was smaller than the size of plaques produced
by the inoculum virus, and that plaques produced from virus recovered from the
day 8 post-inoculation samples were even smaller and barely visible. This
suggests a strong selective pressure in the bone marrow for virus that is much less
efficient in forming plaques on BHK-21 cells.

To demonstrate that S51 virus genomes were present in bone
marrow cells long after acute infection, four to six-week-old female CD-1 mice
were inoculated ic with 500 PFU of S51. Three months post-inoculation two
animals were sacrificed, perfused with paraformaldehyde and decalcified as
described in Example 8. The heads and hind limbs from these animals were
paraffin embedded, sectioned, and probed with a S.A.AR86 specific [35S]-UTP
labeled riboprobe derived from clone pDS-45. In situ hybridization signal was
clearly present in discrete cells of the bone and bone marrow of the legs (data not
shown). Furthermore, no in situ hybridization signal was detected in an adjacent
control section probed with an influenza virus HA gene specific riboprobe. As the
relative sensitivity of in situ hybridization is reduced in decalcified tissues (Peter
Charles, personal communication), these cells likely contain a relatively high
number of viral sequences, even at three months post-inoculation. No in situ
hybridization signal was observed in mid-sagittal sections of the heads with the
S.A.AR86 specific probe, although focal lesions were observed in the brain
indicative of the prior acute infection with S51.



TABLE 5

S51 Titers in Bone Marrow Following IC Inoculation of 500 PFU

Days Post-      Titers (Total PFU/Animal)       Limit of
Inoculation     Animal A        Animal B        Detection

4               2100            380             62.5
8               62.5            N.D. (a)        62.5
16              N.D.            N.D.            62.5
30              N.D.            N.D.            62.5

(a) "N.D." indicates that the virus titers were below the limit of detection.

EXAMPLE 13
Replication of S.A.AR86 within Bone/Joint Tissue of Adult Mice

Several Old World alphaviruses, including Ross River virus,
Chikungunya virus, Ockelbo82, and S.A.AR86, are associated with acute and persistent
arthritis/arthralgia in humans. Molecular clones of several Sindbis group viruses,
including S.A.AR86, were used to investigate alphavirus replication within bone/joint
tissue.

Following intravenous inoculation of S.A.AR86 into adult CD-1 mice,
viral replication was observed in bone/joint tissue, but not surrounding muscle tissue of
the hind limbs. Infectious virus was detectable 24 hours post-infection; however, viral
titer within bone/joint tissue was maximal 72 hours post-infection. Fractionation of
hind limbs from infected animals revealed that the hip and knee joints were the
predominant sites of viral replication. Replication within bone/joint tissue appears to be
a common trait of Sindbis-group viruses, since the laboratory strains TR339 and TRSB
also replicated within bone/joint tissue. In situ hybridization and S.A.AR86 based
double promoter vectors expressing green fluorescent protein were used to further
localize S.A.AR86 infected cells within bone/joint tissue. Green fluorescent protein
expression was detected in bone/joint tissue for at least one month post-inoculation.
These studies demonstrated that cells within the endosteum of synovial joints were the
predominant site of S.A.AR86 replication.

THAT WHICH IS CLAIMED IS:

1. A method of introducing and expressing heterologous RNA 
in bone marrow cells, comprising: 

(a) providing a recombinant alphavirus, said alphavirus 
containing a heterologous RNA segment, said heterologous RNA segment 
comprising a promoter operable in said bone marrow cells operatively associated
with a heterologous RNA to be expressed in said bone marrow cells; and then 

(b) contacting said recombinant alphavirus to said bone marrow 
cells so that said heterologous RNA segment is introduced and expressed therein. 

2. A method according to claim 1, wherein said contacting step 
is carried out in vitro. 

3. A method according to claim 1, wherein said contacting step
is carried out in vivo in a subject in need of such treatment. 

4. A method according to claim 1, wherein said heterologous
RNA encodes a protein or peptide.

5. A method according to claim 1, wherein said heterologous
RNA encodes an immunogenic protein or peptide.

6. A method according to claim 1, wherein said heterologous 
RNA encodes an antisense oligonucleotide or a ribozyme. 

7. A method according to claim 1, wherein said alphavirus is
an Old World alphavirus. 

8. A method according to claim 1, wherein said alphavirus is
selected from the group consisting of SF group and SIN group alphaviruses.

9. A method according to claim 1, wherein said alphavirus is
selected from the group consisting of Semliki Forest virus, Middelburg virus,
Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, Barmah Forest
virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus, Una virus,
Sindbis virus, South African Arbovirus No. 86, Ockelbo virus, Girdwood S.A.
virus, Aura virus, Whataroa virus, Babanki virus, and Kyzylagach virus.

10. A method according to claim 1, wherein said alphavirus is 
South African Arbovirus No. 86. 



11. A method according to claim 1, wherein said alphavirus is 
Girdwood S.A. 



12. A method according to claim 1, wherein said alphavirus is 
Sindbis strain TR339. 

13. A helper cell for expressing an infectious, propagation
defective, Girdwood S.A. virus particle, comprising, in a Girdwood
S.A.-permissive cell:

(a) a first helper RNA encoding (i) at least one Girdwood S.A. 
structural protein, and (ii) not encoding at least one other Girdwood S.A. structural 
protein; and 

(b) a second helper RNA separate from said first helper RNA,
said second helper RNA (i) not encoding said at least one Girdwood S.A.
structural protein encoded by said first helper RNA, and (ii) encoding said at least
one other Girdwood S.A. structural protein not encoded by said first helper RNA,
and with all of said Girdwood S.A. structural proteins encoded by said first and
second helper RNAs assembling together into Girdwood S.A. particles in said cell
containing said replicon RNA;

and wherein the Girdwood S.A. packaging segment is deleted from 
at least said first helper RNA. 

14. The helper cell according to claim 13, further containing a
replicon RNA;

said replicon RNA encoding said Girdwood S.A. packaging segment
and an inserted heterologous RNA; 

wherein said Girdwood S.A. packaging segment is deleted from at 
least one of said helper RNA; 

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another. 

15. The helper cell according to claim 13, further containing a
replicon RNA;

said replicon RNA encoding said Girdwood S.A. packaging segment 
and an inserted heterologous RNA; 

wherein said replicon RNA and said first helper RNA are separate
molecules;

and wherein the molecule containing said replicon RNA further 
contains RNA encoding said at least one Girdwood S.A. structural protein not 
encoded by said first helper RNA. 

16. The helper cell according to claim 13, wherein said first 
helper RNA encodes both the Girdwood S.A. E1 glycoprotein and the Girdwood
S.A. E2 glycoprotein, and wherein said second helper RNA encodes the Girdwood 
S.A. capsid protein. 

17. A method of making infectious, propagation defective, 
Girdwood S.A. virus particles, comprising:

transfecting a Girdwood S.A.-permissive cell according to claim 13 
with a propagation defective replicon RNA, said replicon RNA including said 
Girdwood S.A. packaging segment and an inserted heterologous RNA;

producing said Girdwood S.A. virus particles in said transfected
cell; and then

collecting said Girdwood S.A. virus particles from said cell. 




18. Infectious Girdwood S.A. virus particles produced by the 
method of Claim 17. 



19. Infectious Girdwood S.A. virus particles containing a
replicon RNA encoding a promoter, an inserted heterologous RNA, and wherein
RNA encoding at least one Girdwood S.A. structural protein is deleted therefrom
so that said Girdwood S.A. virus particle is propagation defective.

20. A pharmaceutical formulation comprising infectious 
Girdwood S.A. virus particles according to claim 18 or 19 in a pharmaceutically
acceptable carrier. 

21. A helper cell for expressing an infectious, propagation
defective, TR339 virus particle, comprising, in a TR339-permissive cell:

(a) a first helper RNA encoding (i) at least one TR339 structural 
protein, and (ii) not encoding at least one other TR339 structural protein; and 

(b) a second helper RNA separate from said first helper RNA, 
said second helper RNA (i) not encoding said at least one TR339 structural protein
encoded by said first helper RNA, and (ii) encoding said at least one other TR339
structural protein not encoded by said first helper RNA, and with all of said
TR339 structural proteins encoded by said first and second helper RNAs
assembling together into TR339 particles in said cell containing said replicon
RNA;

and wherein the TR339 packaging segment is deleted from at least 
said first helper RNA. 

22. The helper cell according to claim 21, further containing a
replicon RNA;

said replicon RNA encoding said TR339 packaging segment and an
inserted heterologous RNA;

wherein said TR339 packaging segment is deleted from at least one 
of said helper RNA; 

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another.

23. The helper cell according to claim 21, further containing a
replicon RNA;

said replicon RNA encoding said TR339 packaging segment and an
inserted heterologous RNA;

wherein said replicon RNA and said first helper RNA are separate
molecules;

and wherein the molecule containing said replicon RNA further 
contains RNA encoding said at least one TR339 structural protein not encoded by 
said first helper RNA. 



24. The helper cell according to claim 21, wherein said first
helper RNA encodes both the TR339 E1 glycoprotein and the TR339 E2
glycoprotein, and wherein said second helper RNA encodes the TR339 capsid
protein.

25. A method of making infectious, propagation defective, 
TR339 virus particles, comprising: 

transfecting a TR339-permissive cell according to claim 21 with a
propagation defective replicon RNA, said replicon RNA including said TR339 
packaging segment and an inserted heterologous RNA; 

producing said TR339 virus particles in said transfected cell; and
then

collecting said TR339 virus particles from said cell. 

26. Infectious TR339 virus particles produced by the method of
Claim 25.

27. Infectious TR339 virus particles containing a replicon RNA
encoding a promoter, an inserted heterologous RNA, and wherein RNA encoding
at least one TR339 structural protein is deleted therefrom so that said virus particle
is propagation defective.

28. A pharmaceutical formulation comprising infectious TR339 
virus particles according to Claim 26 or 27 in a pharmaceutically acceptable carrier.

29. A recombinant DNA comprising a cDNA coding for an
infectious Girdwood S.A. virus RNA transcript and a heterologous promoter 
positioned upstream from said cDNA and operatively associated therewith. 

30. An infectious RNA transcript encoded by a cDNA according
to claim 29.



31. An infectious RNA according to claim 30, said infectious 
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 



32. Infectious viral particles containing an RNA transcript
according to claim 30. 



33. A recombinant DNA comprising a cDNA coding for a 
Sindbis strain TR339 RNA transcript and a heterologous promoter positioned 
upstream from said cDNA and operatively associated therewith. 

34. An infectious RNA transcript encoded by a cDNA according
to claim 33.

35. An infectious RNA according to claim 34, said infectious 
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 

36. Infectious viral particles containing an RNA transcript 
according to claim 34. 

Nucleotide Sequence of S.A.AR86

1 ATTGCCGGCG TACTACACAC TATTCAATCA AACACCCCAC CAATTCCACT ACCATCACAA TCCAGAACCC ACTACTT AAC CTACaCCTAG ACCCTCACAC 
101 TCCCTTTCTC CTCCAACTCC AAAACACCTT CCCCCAATTT CACCTACTaC CACACCACCT CACTCCAAAT CACCATCCTA ATCCCACACC ATTTTCCCAT 
201 CTCCCCAGTA AACTAATCCA CCTCCACCTT CCTACCACAC CCACCATTTT CCACATACCC ACCCCACCCC CTCCTACAAT GTTTTCCGAG CACCAGTACC
301 ATTGCGTTTG CCCCATGCGT AGTCCAGAAG ACCCGGACCG CATGATGAAA TATGCCAGCA AACTCGCGGA AAAAGCATGT AAGATTACAA ACAAGAACTT
401 GCATGAGAAG ATCAAGGACC TCCGGACCGT ACTTGATACA CCGGATGCTG AAACGCCATC actctgcttc CACAACGATG ttacctgcaa cacccgtgcc 
501 GAGTACTCCG tcatgcagga CGTGTACATC AACCCTCCCG GAACTATTTA ccaccagcct atgaaaggcg tgcggaccct GTACTGGATT cgcttcgaca 
601 CCACCCAGTT CATGTTCTCG GCTATGGCAG GTTCGTACCC TGCATACAAC ACCAACTCCG CCGACGAAAA ACTCCTTCAA CCGCGTAACA TCCGACTCTG 

701 CAGCACAAAG CTGAGTGAaG gcaggacagg aaagttgtcg ataatgagga ACAAGCAGTT GAAGCCCGGG tcaccggttt atttctccgt tggatcgaca 
801 CTTTACCCAG aacacagagc caccttccag agctggcatc ttccatccgt CTTCCACTTG AAAGGAAAGC agtcgtacac ttcccgctgt gatacagtgg 

901 TGAGCTCCGA AGGCTACCTA GTGAAGAAAA TCACCATCAG TCCCGGCATC ACCGGAGAAA CCGTGGGATA CCCGCTTACA AACAATAGCG AGGGCTTCTT 

1001 cctatccaaa GTTACCGATA CAGTAAAAGG agaacgggta tccttccccg tgtccaccta tatcccggcc accatatgcg atcagatgac cgccataatg 
1101 cccacggata tctcacctga cgatgcacaa aaacttctcg ttgggctcaa ccagcgaatc gtcattaacg gtaagactaa caggaacacc aataccatgc 

1201 AAAATTACC7 TCTCCCAATC A7TGCACAAG GGTTCAGCAA ATGGCCCAAG GAGCGCAAAG AAGATCTTGA CAATGAAAAA atgctgggca ccagagagcg 

1301 CAAGCTTACA TATGGCTGCT 7GTGGGCGTT TCGCaCTAAG AAAGTCCACT CGTTCTATCG CCCACCTGGA ACGCaGaCCA TCGTAAAACT CCCAGCCTCT 

1401 TTTAGCGCTT TCCCCATGTC ATCCGTATGG ACTACCTCTT TGCCCA7GTC CCTGACGCAG AAGATGAAAT TGGCaTTACA ACCAAAGAAG GAGGAAAAAC 

1501 TGCTGCAAGT CCCCGAGGAA TTAGTTATGG AGGCCAAGGC TGCTTTCGAG GATGCTCAGG AGGAATCCAG AGCGGAGAAG CTCCGAGAAG cactcccacc 

1601 ATTAGTGGCA GACAAAGGTA TCGAGGCAGC TCCGGAAGTT GTCTGCGAAG TGGACGCGCT CCAGGCGGAC ACCGGAGCAG CACTCC7CGA AACCCCCCCC

1701 CGTCATGTAA GGATAATACC TCAAGCAAAT GACCGTATGA TCGGACAGTA TATCCTTCTC TCCCCGATCT CTGTCCTGAA GAACCC7AAA CTCGCACCAG 

1801 CACACCCGCT AGCAGACCAG GTTAAGATCA TAACGCACTC CGGAACATCA GGAACGTATG CACTCGAACC ATACGACGCT AAAG7ACTGA TGCCAGCAGG

1901 AAGTGCCGTA CCATGGCCAG AATTCTTACC ACTCAGTCAG AGCGCCACCC TTGTGTACAA CCAAAGaGaG TTTGTGAACC CCAACCTCTA CCATATTCCC 

2001 ATGCACGCTC CCGCTAAGAA TACaGAAGAG GAGCAGTACA ACGTTACAAA CCCAGACCTC GCAGAAACAG ACTACCTCTT TGACGTCGAC AAGAACCCAT

2101 GCGTTAAGAA GCAAGAAGCC TCACGACTTG TCCTTTCCGG AGAACTGACC AACCCCCCCT ATCACCAACT acctcttgag ggactgaaga ctcgacccgc 

2201 cgtcccgtac aaggttgaaa caataggagt gataggcaca ccaccatcgg gcaagtcacc tatcatcaag tcaactgtca cgccacgtca tcttcttacc 
2301 AGCCCAAAGA aagaaaactg ccgcgaaatt gacgccgacg tgctacggct gaggggcatg cagatcacgt cgaagacagt ggattccgtt atgctcaacg 
2401 gatcccacaa agccctagaa gtgctgtatg ttgacgaagc gttccgctgc cacgcagcag cactacttgc cttgattgca atcgtcagac cccgtaagaa
2501 cgtagtacta tgcggagacc ctaagcaatg cggattcttc aacatgatgc aactaaaggt ACATTTCAAC CACCCTGAAA AACACATATG taccaagaca
2601 ttctacaagt ttatctcccg ACGTTCCaCA cagccagtca CGGCTATTGT ATCGACACTG CATTACGATG GAAAAATGAA AACCACAAAC ccgtgcaaga
2701 AGAACATCGA aatccacatt acaggggcca cgaagccgaa gccaccggac atcatcctca catgtttccg cgggtgggtt aagcaactcc aaatcgacta
2801 tccccgacat gaggtaatga cagccgccgc ctcacaaggg ctaaccagaa aaggagtata tgccgtccgg caaaaagtca atcaaaaccc cctgtacgcg
2901 ATCACATCAG accatgtgaa cgtcttgctc acccgcactg aggacagcct agtatggaaa actttacagg cccacccatg gattaaccag ctcactaacg 
3001 TACCTAAAGG aaattttcag cccaccatcg aggactggca agctgaacac aagggaataa ttcctgcgat AAACACTCCC cctccccgta ccaatccgtt 
3101 CAGCTGCAAG actaaccttt cctcggcgaa agcactggaa cccatactcc ccaccccccg tatcctactt accccttccc agtggagcca cctcttccca 

3201 CAGTTTGCGG ATGACAAACC ACACTCCGCC ATCTACCCCT TAGACGTAAT TTCCATTAAG TTTTTCGGCA TCCaCTTGaC AAGCCGCCTG TTTTCCAAAC 
3301 ACAGCATCCC gttaacctac catcctcccg actcagccag cccactacct CATTCCCACA ACAGCCCACG AACACGCAAG tatccgtacc ATCACCCCGT 
3401 TGCCGCCCAA ctctcccgta gatttccgct cttccaccta cctccgaaag ccacacagct tcatttccag acccccagaa ctagagttat ctctccacag
3501 cataacttgg tcccactgaa ccgcaatctc cctcacgcct TAGTCCCCCA CCACAACCAG AAACAACCCG GCCCGGTCGa AAAA7TCTTG ACCCAGTTCA 
3601 AACACCACTC CGTACTTGTG ATCTCAGAGA AAAAAATTGA AGCTCCCCaC AAGaGAATCG AATGGATCGC CCCGaTTGGC ATAGCCGGCG CAGATAAGAA 
3701 CTACAACCTG GCTTTCGCGT TTCCGCCGCA GGCACGGTAC GACCTGGTGT TCATCAATAT TGGAACTAAA TACAGAAACC ATCACTTTCA ACAGTGCGAA 
3801 CACCACCCCG CGACCTTGAA AACCCTTTCG CGTTCCGCCC TGAACTCCCT TAACCCCGGA CCCACCCTCG TGGTGAAGTC ctaccgttac cccgaccgca

3901 atagtgagga cgtagtcacc gctcttccca gaaaatttct cagagtgtct gcagcgaggc cagagtgcgt ctcaagcaat acagaaatgt acctgatttt 
4001 ccgacaacta gacaacagcc gcacacgaca attcaccccg catcatttga attgtctgat ttcgtccgtg tacgagggta caagagacgg agttggagcc 
4101 gcaccgtcgt accgtactaa aagggagaac attcctgatt gtcaagagga agcagttgtc aatgcagcca atccactgcg cagaccagga gaaggactct 

4201 CCCGTCCCAT CTATAAACGT TGGCCGAACA GT7TCACCCA TTCAGCCACA GAGACAGGTA CCCCAAAACT GACTGTGTCC CAAGGAAAGA AAGTGATCCA 

4301 CGCGGTTGCC CCTGATTTCC CGAAACACCC AGAGGCAGAA GCCCTGAAAT TGCTGCAAAA CGCCTACCAT CCAGTCGCAG ACTTAGTAAA TGAACATAAT 

4401 ATCAAGTCTG TCGCCATCCC ACTGCTATCT ACAGGCATTT ACGCAGCCGG AAAAGACCGC CTTGACGTAT CACTTAACTG CTTGACAACC GCGCTAGACA

4501 GAACTGATGC GGACGTAACC ATCTACTCCC TGGATAAGAA GTGGAAGGAA AGAATCGACG CCGTGCTCCA ACTTAAGGAG TCTGTAACTG AGCTGAAGGA 

4601 TGAGGATATG GAGATCCACG ACGAGTTAGT ATGCATCCAT CCGGACAGTT GCCTGAAGGG AAGAAAGGGA TTCAGTACTA CAAAAGGAAA GTTGTATTCG 

4701 TACTTTGAAG GCACCAAATT CCATCAACCA GCAAAAGATA TGGCGGAGAT AAAGGTCCTG TTCCCAAATG ACCAGGAAAG CAACGAACAA CTGTGTGCCT 

4801 ACATATTGGG GGAGACCATG GAAGCAATCC CCGAAAAATG CCCCGTCCAC CACAACCCCT CGTCTACCCC GCCAAAAACG CTCCCGTGCC TCTCTATGTA

4901 TGCCATGACG CCAGAAAGCG TCCACAGACT CAGAAGCAAT AACGTCAAAG AAGTTACAGT ATCCTCCTCC ACCCCCCTTC CAAAGTACAA AATCAAGAAT 

5001 GTTCAGAAGG TTCAGTCCAC AAAAGTAGTC ctgtttaacc cgcatacccc cccattcgtt CCCGCCCGTA AGTACATAGA AGCACCAGAA cagcctgcag

5101 ctccgcctgc acaggccgag gaggcccccg gagttgtagc gacaccaaca ccacctgcag ctgataacac ctcgcttgat gtcacggaca tctcactgga

5201 CATGGAAGAC agtagcgaag gctcactctt ttcgagcttt agcggatcgg acaactaccg aaggcaggtg gtcgtgcctg acgtccatgc cgtccaagag

5301 cctgcccctg ttccaccccc aaggctaaag AAGATCGCCC ccctggcagc ggcaagaatg caggaagagc caactccacc ggcaagcacc agctctccgg 

5401 acgagtcccttcacctttct tttgatggcg tatctatatc cttccgatcc cttttcgacg gagagatggc ccgcttggca GCGCCACAAC ccccggcaag

5501 TACATGCCCT acggatgtgc ctatgtcttt cggatcgttttccgacggag agattgagga gttcagccgc agagtaaccg agtcggagcc cgtcctgttt 

5601 gcgtcatttg aacccggcga actgaactca attatatcgt cccgatcagc cgtatctttt ccaccacgca agcagagacg tagacgcagg agcaggagga 

5701 CCGAATACTG tctaaccggg gtaggtgggt ACATATTTTC GACGGACACA GGCCCTGGGC ACT7GCAAAA gaagtccgtt ctgcagaacc agcttacaga 

5801 ACCGACCTTG gagcgcaatg ttctggaaag aatctacgcc cccgtgctcg acacgtcgaa agaggaacag ctcaaactca cgtaccagat gatgcccacc

5901 GAAGCCAACA AAAGCaCGTA ccactctcga aaagtagaaa accagaaagc cataaccact cagccactcc tttcaggcct acgact g tat aactctccca

6001 CAGATCAGCC agaatgctat aagatcacct acccgaaacc atcgtattcc agcagtctac CAGCGAACTA ctctgaccca aactttgctg tagctgtttg 

6101 TAACAACTAT ctgcatgaca attaccccac cgtagcatct tatcagatca CCGACGAGTA cgatgcttac ttggatatgg TAGACGGGAC agtcgcttgc 

6201 CTAGATACTG caactttttg cccccccaag cttagaagtt acccgaaaag acacgagtat agagccccaa acatccccag tcccgttcca tcagcgatgc 

6301 agaacacctt gcaaaacgtg ctcattgccg cgactaaaag aaactgcaac gtcacacaaa tgcgtgaact gccaacactg gactcagcga cattcaacgt 

6401 TGAATGCTTT CGAAAATATG catgcaatca ccagtattgg gaggagtttc cccgaaagcc aattagcatc actactgagt tcgttacccc atacgtggcc 

6501 AGACTGAAAG gccctaaggc cgccgcactc ttcgcaaaga CCCATAATTT CGTCCCATTG CaaCAAGTGC ctatggatag attcgtcatg CACATGAAAA 

6601 CAGACGTCAA agttacacct cccacgaaac acacagaaga aagaccgaaa gtacaagtga tacaagcccc agaacccctg gcgaccgctt acctatgccg 

6701 gatccaccgg gagttagtgc ccagccttac agccgttttg ctacccaaca ttcacaccct ctttgacatg tcggcgcagg ACTTTGATGC AATCATAGCA 

6801 CAACACTTCA AGCAAGGTCA CCCGGTACTG GAGACGGATA TCCCCTCG7T CGACAAAAGC CAAGACGACG CTATGCCGTT AACCCGCCTG ATG ATCTTGG
6901 AAGACCTGGG TCTGCACCAA CCACTACTCG ACTTCATCGA GTGCCCCTTT CGAGAAATAT CATCCACCCA TCTGCCCACG GGTACCCGTT TCAAATTCGG 
7001 GCCGATCATG AAATCCGGAA TCTTCCTCAC GC7CTTTCTC A A CACAGTTC TGAATGTCCT TaTCGCCAGC AGAGTaTTCG AGGACCCGCT TAAAACGTCC 
7101 AAATCTGCAC catttatcgc CCACGACAAC ATTATACACG GAGTAGTATC TGACAAAGAA ATGGCTGAGA ggtgtcccac ctccctcaac atggaggtta 
7201 AGATCATTGA cccagtcatc CGCGAGAGAC caccttactt CTCCGGTGGA TTCATCTTCC AAGATTCGCT tacctccaca gcctctcccg tccccgaccc
7301 CTTGAAAAGG CTGTTTAAGT TCCGTAAACC CCTCCCAGCC CACGATGAGC AAGACGAAGA CaGAAGaCGC CCTCTCCTAG ATCAAACAAA GGCGTGGTTT
7401 AGAGTAGGTA TAACAGACAC CTTAGCAGTG gccgtgccaa CTCGGTATGA GGTAGACAAC ATCACACCTG TCCTGCTGGC ATTGAGAACT TTTGCCCAGA 
7501 G C AAA AC AGC ATTTCAAGCC ATCaGAGCCG AAATAAAGCA TCTCTACGCT CGTCCTAAAT AGTCAGCATA GTACATTTCA TCTGACTAAT ACCACAACAC 
7601 CACCACCATG AATAGAGGAT TCTTTAACaT CCTCCCCCCC CGCCCCTTCC CAGCCCCCAC TCCCATGTGG AGGCCCCGGA GAAGGAGCCA GGCGGCCCCG 
7701 ATCCCTCCCC GCAATGGGCT GGCTTCCCAA ATCCAGCAAC TGACCaCAGC CCTCACTCCC CTAGTCATTG CaCACCCAAC TAGACC7CAA aCCCCACCCC 
7801 CACGCCCCCC GCCCCCCCAG AAGAAGCAGG CGCCAAAGCA ACCACCCAAG CCGAAGAAAC CAAAAACACA cgagaagaag AAGAAGCAAC CTGCAAAACC 
7901 CAAACCCCGA AACACACACC CTATCCCACT TAACTTCCAG GCCGACaGAC TCTTCCACCT CAAAAATCAC CACCCACATC TCATCCCCCA CCCACTGGCC
8001 ATCGAAGGAA ACCTAATGAA aCCACTCCAC CTGAAACGAA CTATTGACCA CCCTCTGCTA TCAAACCTCA AATTCACCAA GTCGTCAGCA TACGACATCG
AGTTCCCACA CTTCCCCCTC AACATGAGAA CTGACGCGTT CACCTACACC AGTGAACACC CTGAAGGGTT CTACAACTCG CACCACCGaG CCGTGCAGTA 

tagtccaggc agatttacca fccccccccc AGTAGCAGCC AGACGAGaCA gtggtcgtcc gattatggat aactcacccc cggttctcgc catagtcctc 
gcaggcgctg atgacgcaac aacaaccgcc ctttcggtcctcacctcgaa tagcaaaggg aagacaatca acacaacccc cgaaggcaca gaagagtggt 
ctcctgcacc actggtcacc gccatgtcct tccttggaaa cgtgagcttc ccatccaatc gcccgcccac atcctacacc cgcgaaccat ccagagctct 
ccacatcctc gaagacaacg tgaaccacca ggcctacgac accctgctca accccatatt cccgtcccga tcgtccggca caagtaaaag aagcgtcact 
gacgacttta ccttgaccag cccgtacttg ggcacatgct cgtactgtca ccatactgaa ccctccttta cccccattaa gatcgaccag gtctcgcatg 
aagcggacga caacaccata cgcatacaga cttccgccca gtttgcatac caccaaaccg cagcaccaag ctcaaataag tacccctaca tgtccctcga 
gcacgatcat actctcaaag aaggcaccat cgatgacatc aagatcagca cctcagcacc gtctacaagg cttagctaca aaggatactt tctcctcccg 

AAGTGTCCTC CAGGGGACAG CGTAACGCTT ACCATAGCGA CTAGCAACTC AGCAACGTCA TCCACAATCG CCCGCAAGAT AAAACCAAAA TTCGTGCGAC 
CGCAAAAATA TGaCCTACCT cccgttcacg GTAAGAAGAT TCCTTGCACA GTGTACGACC GTCTCAAAGA AACAACCGCC GGCTACATCA CTATGCACAG 
GCCCGCACCG CATCCCTATA CATCCTATCT GGAGGAATCA TCAGGGAAAG TTTACCCGAA CCCACCATCC GGGAACAACA ttacgtacga gtgcaagtgc 

ggcgattaca agacccgaac cgttacgacc cgtaccgaaa tcaccggctg cacccccatc aagcagtccg tcccctataa gaccgaccaa accaagtgcg 

TCTTCAACTC CCCCGACTCG ATCAGaCaCG CCCACCACAC GGCCCAAGGG AAATTCCATT TCCCTTTCAA CCTGATCCCG AGTACCTGCA TCGTCCCTGT 
TCCCCACCCG CCGAACGTAC TACACCGCTT TAAACACATC ACCCTCCAAT TAGACACAGA CCATCTGACA TTCCTCACCA CCAGGAGACT AGCGGCAAAC 
CCCGAACCAA CCACTGAATG GaTCaTCGGA aaCACGGTTA GAAACTTCAC CGTCGACCCA CATCCCCTGG AATACATATG GGGCAATCAC CAACCAGTAA 
GCCTCTATCC CCAAGAGTCT GCACCAGGAG ACCCTCACCG ATGCCCACAC GAAATAGTAC ACCATTACTA TCATCCCCAT CCTGTGTACA CCATCTTAGC 
CGTCGCATCA CCTCCTGTGG CGATGATGAT TCGCGTAACT GTTGCAGCAT TATCTCCCTC TAAACCCCCC CGTCACTGCC TGACGCCATA TGCCCTCCCC 
CCAAATCCCG TCATTCCAAC ttccctggca cttttgtcct GTGTTACGTC CCCTAATCCT gaaacattca CCGAGACCAT CAGTTACTTA TGGTCCAACA 
CCCACCCGT7 CTTCTCCGTC CAGCTGTGTA TACCTCTGCC CGCTGTCGTC GTTCTAATGC CCTCTTCCTC ATCCTGCCTC CCTTTTTTAC TCGT7CCCGG 
CCCCTACCTG CCGAAGCTAG ACCCCTACCA ACATGCCACC ACTCTTCCAA ATGTGCCACA CaTACCGTAT AACCCACTTG TTGAAAGGCC ACCCTACCCC 
CCCCTCAATT TCGAGATTAC TGTCATGTCC TCGGACGTTT TGCCTTCCAC CAACCAAGAG TaCATTACCT CCAAATTCAC CACTCTCCTC CCC7CCCCTA 
AACTCACATG CTCCGCCTCC TTCGAATGTC ACCCCCCCGC TCACGCACAC TATACCTCCA ACGTCTTTGG AGGGGTGTAC CCCTTCaTGT CGGGaCCAGC 
ACAATCmTTGCCACACTG AGAACAGCCA GaTCAGTGAO GCGTACGTCG AATTCTCAGT AGATTCCGCG ACTGACCACG CCCACGCGAT TAAGGTGCAT 
ACTCCCCCCA TGAAAGTAGG ACTCCGTATA GTGTACGGGA ACACTACCAG TTTCCTAGAT GTGTACGTGA ACCGAGTCAC ACCAGGAACG TCTAAAGACC 
TGAAAGTCAT AGCTCCACCA ATTTCACCAT TCTTTACACC attccatcac AAGCTCCTTA tcaatcccgg cctcctgtac aactatgact ttccccaata 
ccgacccatg aaaccaggag cctttccaga cattcaacct acctccttga CTACCAAACA cctcatcccc agcacagaca ttaggctactcaagccttcc 

GCCAAGAACG TGCATCTCCC GTaCaCCCAG CCCCCATCTC GATTCGaGAT GTGGAAAAAC AACTCACCCC GCCCACTGCA GGAAACCGCC CCTTTTGGGT 
GCAAGATTCC AGTCAATCCG CTTCGAGCGG TCCACTCCTC ATACGGGAAC ATTCCCATTT CTATTGACAT CCCCAACCCT GCCTTTATCA GGACATCACA 
TGCACCACTG CTCTCAACAG TCAAATGTGA TGTCACTGAG TGCACTTATTCAGCCGACTTCGGAGGGATC GCTACCCTGC AGTATGTATCCGACCGCCAA 
GGACAATCCC CTGTACATTC CCATTCGAGC ACaCCAACCC TCCAAGAGTC GACAGTTCAT CTCCTGGAGA AAGGAGCGGT GACAGTACAC TTCACCACCG 
CGACCCCACA CCCGAACTTC ATTCTATCGC TGTCTGGTAA GAAGACAACA TGCAATGCaG AATGCAAACC ACCAGCTGAT CATATCCTGA GCACCCCGCA 
CAAAAATCAC CAAGAATTCC AAGCCGCCAT CTCAAAAACT TCATGGAGTT CCCTCTTTGC CCTTTTCCGC GGCGCCTCGT CGCTATTAAT tataggactt 

atcatttttg cttgcaccat catcctcact agcacaccaa catgacccct acgccccaat gacccgacca gcaaaactcg ATGTACTTCC gacgaactga 

TGTGCATAAT CCATCACCCT GGTATATTAG ATCCCCCCTT ACCCCGGCCA ATATACCAAC ACCAAAACTC GACCTATTTC CGACGAAGCG CaGTCCATAA 
TCCTCCCCAC TGTTCCCAAA TAATCACTAT ATTAACCATT TATTCACCCG ACGCCAAAAC TCAATGTATT TCTGAGGAAG CATGCTCCAT AATGCCATGC 

agcgtctgca taacttttta ttatttcttt TATTAATCAA CAAAATTTTC tttttaacat ttc 

S.A.AR86

Amino Acid Sequence of the Nonstructural Polyprotein 

1 MEKPWNVOV OPQSPFWQL QXSFPQFEW AQQVTPNDHA NARAF5HLAS KLIELEVFTT AT1LDICSAP ARRMFSEHQY HCVCPMR5PE DPDRMMKYAS

101 KLAEXACXn* NKNLHEKDCD LRTVLDTFOA ETPSLCFHND VTCNTRAEYS VMQDVYINAF GTTYHQAMKG VRTLYWIGFD TTQFMFSAMA GSYPAYNTNW 

201 ADEXVLEARN IGLCSTKLSE GRTGKLSIMR KKEUCPGSRV YFSVGSTLYP EHRA3LQSWH LPSVFHUCCK QSYTCRCDTV VSCEGYWKK mSPGITCE

301 TVCYAVTNNS EGFLLCXVTD TVXCERVSFP VCTYtPATIC DQMTGIMATO ISFDOAQKU. VCLWQRfVIN CKTNRNTNTM QWYLLPOAQ GFSXWAKStX 

401 EDLDNEKMLG TRERXLTYCC LWAFJtTXKVH SFYRPPGTQT IVKVFASFSA FFMSSVWTTS LPMSLRQKMK LALQPKKEEX LLQV7EELVM EAKAAFEDAQ 

501 EESRAEKUIE ALPPLVADKG IEAAAEWCE VEGtQADTGA ALVETFRGHV RIIPQANDRM fCQYfWSPf SVTJCNAKLAP AHPLADQVKI ITHSGRSGRY

601 AYEPYDAXVL MPAGSAVPW? EFLALSESAT LYYNEREFVN RKLYHtAMHG PAKNTEEEQY KVTKAELAET EYVFDVDKKR CVKXEEASGL VLSGELTNFP 

701 YHELALEGLK TRPAVPYXVE TIGVIGTPGS GKSA10C5TV TARDLVTSGK KENCREIEAO VLRLRGMQIT SKTVDSVMLW GCHKAVEVLY VDEAFRCHAG 

801 ALLAUAIVR FRXKWLCGO PKQCGFFXMM QLKVHFNHFS KDICTXTFYK FISRRCTQPV TAIVSTLHYD GKMJCTTNPCX KNTEIDITGA TKPXPCOIIL

901 TCFRCWVKQL QIDYPGHEVM TAAASQGLXR KGVYAVRQKV NENPLYAfTS EHVNVT.LTRT EDRLVWKTLQ CDPWtXQLTN VFKGNFQATI EDWEAEHXGI

1001 JAAIKSPAPR YNPF5CKTNV CWAKALEPIL ATAGIVLTGC QWSELFPQFA DDKFHSAXYA LDVICDCFFG MDLTSGLFSK QSIPLTYHPA OSARPVAHWD 

1101 NSPGTRKYGY DHAVAAELSR RfPVFQLAGK GTQLDLQTGR TRVISAQHNL VpVNRXLPHA LVPEHKEKQP GFVEXFLSQF KHKSVLV1SE KXEAPHXRJ 

1201 EW1AP1GIAG ADKNYNLAFG FPPQARYDLV FTWGTKYRN HHFQQCEDHA ATUCTLSRSA LNCLNPGGTL WKSYGYADR NSEDWTALA RXFVRVSAAA 

1301 PECVSSNTEM YUFRQLDNS RTRQFTTHHL NCVISSVYEG TRDGVGAAPS YRTXRENIAO CQEEAWNAA KPLGRPCEGV CRAJYXRWPN SFTDSATETG 

1401 TAX LTYCQGK KVTHAVGPOF RXHPEAEAUC tXQNAYHAVA DLVNEKNTKS VAIPLLSTGI YAAGKDRLEV SLNCLTTALD RTDADVTTYC LDKXWKERJD 

1501 AVLQLXESVT ELKDEDMEID OELVWIHPDS CLKGRXGFST TKGKLYSYFE GTKFHOAAKD MAEDCVLFPN DQESNEQLCA YILGETMEAI REKCFVDHNP 

1601 SSSPFKTLPC LCMYAMTPER VHRLRSNNVK EVTYCSTTPL PKYXKNVQK VQCTXWLFN PKTPAFVPAR KYJEAPEQPA APPAQAEEAP GWATPTPPA 

1701 AONTSU3VTO ISLDMED1SE GSLFS5F5GS DKYKRQVWA DVHAVQEPAP VPPPRLXXMA RLAAARMQEE PTPPASTSSA DESLHLSFDG VSISFGSLFD 

1801 GEMARLAAAQ PPASTCPTDV FMSFGSFSDG E1EELSRAVT ESEFVLFGSF EPGEVNSHS SRSAVSFPPR KQRRRRRSRR TEYCLTGVGG YtFSTDTGPG

1901 HLOXKSVLQN QLTEPTLERN VLERIYAPVL DTSKEEQUCL RYQMMPTEAN KSRYQ5RKVE NQKATTTERL LSGLRLYKSA TDQPECYKTT YFXPSYSSSV 

2001 PANYSOPKFA VAVCNNYLHE NYPTVASYQI TDEYDAYLOM VOGTVACLDT ATFCPAXLRS YPJCRHEYRAP KTR3AVPSAM QNTLQNYUA ATKRNCNVTQ

2101 MRELFTLDSA TFNVECFRXY ACNDEYWEEF ARXPIRITTE FVTAYVARLJC GPKAAAL5AX THNLVPLQEV PMDRFVMDMK RDVXVTPGTK HTEERPKVQV 

2201 1QAAEPLATA YLCGIHRELV RRLTAVLLFN IHTLFDMSAE OFDAflAEHF KQGDPVLETD LASFDKSQDD AMALTGLMIL EDLGVDQPLL DLIECAFGEI

2301 SSTHLPTGTR FXFGAMMKSG M FLTLFVNTY LNW1ASRVL EERLKTSKCA AFIGDDMIH GWSDKEMAE RCATWLNMEV KIIDAV1GER PPYTCGGFTL

2401 QDSVTSTACR VAOPUCRLFK LGKPLPADDE QDEDRRRALL DETKAWFRVG rTDTLAVAVA TRYEVDNTTP VLLALRTFAQ SKRAFQA1RG EKHLYGGFK

1 MNRGFFNMLG RRPFPAPTAM WRPRRRRQAA PMPARNGLAS QIQQLTTAVS ALV1GQATRP QTPRPRPPPR QKKQAPKQPP KPKKPKTQEX KXXQPAXPKP

101 GKRQRMALKL EADRLFDVXN EDGDV1GHAL AMEGKVMKPL HVKGT1DHPV USKUCFTXSS AYDMEFAQLP VNMRSEAFTY TSEHPEGFYN WHHGAVQYSG 

201 GRFT1PRGVG GRGDSGRPIM DNSGRWArv LGGAOEGTRT ALSWTWNSK GKT1KTTPEG TEEWSAAPLV TAMCLLCNVS FPCNRPPTCY TREPSRALDI 

301 LEENVNHEAY OTLLNAlLRC GSSGRSKRSV TDOFTLTSPY LGTCSYCHHT EPCFSPDCIE OVWDEAODWT tRIOTSAQFG YDQSGAASSN KYRYMSLEQO 

401 HTVKEGTWOO IKISTSGPCR RLSYKGYFLL AKCPPGDSVT VSIASSNSAT SCTMARXDCP KFVGREKYOL PPVHGKXIPC TVYDRLKETT AGYTTMHRPG

501 PHAYT5YLEE SSGKVYAXPP SCKNITYECX CCDYKTGTVT TRTEITGCTA (XQCVAYKSD QTXWVFNSPD SIRHAOHTAQ GKLHLPFXU PSTCMVPVAH 

601 APffWHGFKH ISLQLDTDHL TLLTTRRLGA NPEPTTEWll GNTVRN7TVD RDGLEYTWGN HEPVRVYAQE SAPGDPHGWP HEIVOHYYHR HPVYTTLAVa 

701 SAAVAMM1GV TVAALCACXA RRECLTPYAL APNAVIPTSL ALLCCVRSAN AETFTETMSY LWSNSQPFFW VQLCIPLAAV WLMRCCSCC LPFLWAGAY

801 LAKVDAYEHA TTVPNVPQIP YKALVERAGY APLNLEITVM S5EVLPSTNQ EYrrCKFTTV VPSPKVRCCG StECQPAAHA OYTCXVFCGV YPFMWGGAOC

901 FCDSENSOMS EAYVELSVOC ATDHAQAIKV HTAAMKVGLR tVYGNTTSFL DVYVNGVTPG TSKDUCVIAG PISALFTPFO HXWINRGLV YNYOFTEYCA 

1001 MKPGAFGDIQ ATSLTSXOLI ASTDIRLLKP SAXNVHVPYT QAASGFEMWK NNSGRPLOET APFGCKlAVN PLRAVDCSYG NIPIStDlPN AAFIRTSDAP 

1101 LVSTVKCDVS ECTYSADFGG MATUOYVSDR ECQCPVHSHS STATLQESTV HVLEXGA\TV HFSTASPQAN FrvSLCGKXT TCNAECXPPA OHfVSTPHXN

1201 DQEFQAA1SK TSWSWUFALF GCASSLtllG LMIFACSMML TSTRR 

1 NTTCNCCCCC TACTATACAC TATTCAATCA AACACCCCAC CAATTCCACT ACCATCACAA TOCaCAAOCC ACTACTTAAC CTACACCTAC ACCCGCAGAC
101 TCCCTTTCTC CTCCAACTCC AAAACaCCTT CCCCCAATTT CACCTACTAG CACACCACCT CACTCCAAAT CACCATCCTA ATCCCACACC ATTTTCGCAT 
201 CTCCCCACTA AACTAATCCA CCTCCACCTT CCTACCACAG CGACGATTTT GGACATAGGC AGCCCACCGG ctcgtagaat GTTTTCCCAG CACCAGTACC 
301 ATTGCGTTTG CCCCATGCGT AGTCCAGAAG ACCCGGACCG CATGATGAAA TATGCCAGCA AACTGGCGGA AAAAGCATGC AAGATTACGA ATAAGAACTT 
401 CCATGAGAAG ATCAAGCACC TCCCGACCCT ACTTGATaCA CCGGATGCTG AAACGCCATC ACTCTGCTTC CaCAACGATG TTACCTGCAA CACGCGTGCC 
501 GAGTACTCCG TCATCCAGGA CGTCTACATC AACGCTCCCO GAACTATTTA CCATCACGCT ATGAAAGGCG TCCGGACCCT GTACTCGATr CCCTTCGATA
601 CCACCCAGTT CATGTTCTCG GCTATGCCAG GTTCGTACCC TCCGTACAAC ACCAACTGGG CCGACGAAAA AGTCCTCGAA GCGCGTAACA TCGGACTCTG 
701 CACCACAAAG CTGAGTGAAG GCACGaCAGG AAAGTTGTCG ATAATGAGGA AGAAGGAGTT GAAGCCCGGG TCACGGGTTT ATTTCTCCGT TGGATCGACA 

801 CTTTACCCAG AACACAGAGC CACCTTGCAG acctcgcatc ttccatcgot gttccacctg AAAGGAAAGC actcgtacac ttgcccctct catacagtcg
901 tgagctgcga aggctacgta gtgaagaaaa tcaccatcag tcccgggatc acgggagaaa ccgtgggata cgcggttaca AACAATAGCG AGCCCTTCTT 

1001 GCTATCCAAA GTTACCGATA CAGTAAAACG AGAACCGGTA TCGTTCCCCG TCrGCACGTA TATCCCCGCC ACCATATCCC ATCACATGAC CGGCATAATG 

1101 CCCACGGATA tctcacctga cgatgcacaa AAACTTCTCG TTGGGCTCAA ccagcgaatc gtcattaacg GTAAGACTAA caggaacacc aataccatgc
1201 AAAATTACCTTCTGCCAATC attccacaag cgttcagcaa atgggccaag gagcgcaaag aagaccttga caatgaaaaa atgctcggta ccagagagcg
1301 caagcttaca tatccctcct tgtgggcgtt tcgcactaag aaagtgcact cgttctatcg cccacctgga acgcagacca tcgtaaaagt cccagcctct 

1401 TTTAGCCCTT TCCCCATGTC atccctatgg actacctctt TGCCCATCTC GCTGACGCAG AAGaTAAAAT TGGCATTACA ACCAAAGAAG GaCGAAAAAC
1501 TGCTGCAAGT CCCGGAGGAA TTAGTCATGG AGGCCAAGGC 7GC7TTCGAG GATCCTCAGG AGGAATCCAG AGCGGAGAAC CTCCGAGAAG cactcccacc
1601 ATTAGTCGCA GACAAAGCTA TCGAGGCAGC CCCCGAAGTT GTCTGCGAAG TGGACGGGCT CCAGGCGGAC ATCGGACCAG CACTCG7CGA AACCCCGCCC 
1701 CGTCATGTAA GGATAATACC ACAAGCAAAT gaccgtatga tccgacagta CATCGTTGTC TCGCCAACCT CTGTGCTGAA GAACGCTAAA ctcgcaccag 
1801 CACACCCCCT agcagaccag gttaagatca taacgcactc cgcaagatca ggaaggtatc cactcgaacc atacgaccct aaagtactga tcccagcagg

1901 AAGTGCCGTA CCaTGGCCAG AATTCTTAGC ACTCAGTGAG AGCGCCACGC TAGTGTACAA CGAAAGaGAG TTTGTGAACC GCAAGCTGTA CCATATTCCC 

2001 atgcacggtc ccgctaagaa tacagaagag gagcagtaca aggttacaaa ggcagagctc GCAGAAACAG agtacgtgtt tgacgtggac aagaagccat

2101 GCGTCAAGAA GGAAGAAGCC TCAGGACTTG TCCTCTCCGG AGAACTGACC AACCCGCCCT ATCACGAACT AGCTCTTGAG GGACTGAAGA CTCGACCCGT
2201 GGTCCCGTAC AAGGTTGAAA CAATAGGACT GATaGGCGCA CCACGATCCG gcaagtcggc tatcatcaag tcaactgtca CGGCACGTGA tcttcttacc 
2301 agcggaaaga aagaaaactg CCGCGAAATT CAGGCCCATG tgctacccct gaccgccatg cacatcacgt cgaagacagt cgattcggtt atgctcaacg 
2401 gatgccgcaa agccgtagaa gtgctgtatg ttgacgaagc gttcgcgtgc cacgcaggag cactacttcc cttgattgca atcgtcagac cccgtcataa
2501 ggtagtgcta tccggagacc ctaagcaatg cggattcttc aacatgatcc aactaaaggt atatttcaac cacccggaaa aagacatatg taccaagaca
2601 TTCTACAAGTTTATCTCCCG acgttccaca cagccagtca ccgctattgt atccacactc cattacgatg gaaaaatgaa AACCACAAAC ccgtgcaaga
2701 agaacatcga aatcgacatt acagggccca ccaagcccaa gccaggggac atcatcctga catccttccg cgcgtccgtt AAGCAACTGC AAATCGACTA 

2801 TCCCGGACAT GACGTAATGA CAGCCCCGGC CTCACAAGGG CTAACCAGAA AAGCAGTATA TCCCGTCCGG CAAAAAGTCA ATGAAAACCC gctgtacccg
2901 ATCACATCAG AGCATGTGAA CGTCCTGCTC ACCCGCACTG AGGACAGCCT AGTaTGGAAA aCTTTaCaGG GCGaCCCATG CaTTAACCAC CTCACTAACG 
3001 TACCAAAAGG AAATTTTCAA GCCACCATCG AGGACTGGGA AGCTGAACAC AAGGGAATAA TTGCTCCGaT AAACAGTCCC GCTCCCCGTA CCAATCCGTT 
3101 CAGCTGCAAG ACTAACGTTT GCTGCCCGAA ACCACTCGAA CCGATACTGG CCACGGCCGG TATCGTACTT ACCCGTTGCC AGTGGaGCGA CCTGTTCCCA 
3201 CAGTTTCCAG ATGaCAAACC acactccccc atctaccccc TCGACGTAAT CTGCATTAAG tttttccgca tggacttgac AACCGGACTG ttttccaaac 
3301 AGAGCATCCC GTTAACGTAC CATCCTGCCG ATTCACCCAG GCCAGTACCT CATTCGGACA ACAGCCCAGG AACCCCCAAG TATGCC7ACG ATCACGCCGT 
3401 TGCCCCCCAA CTCTCCCGTA GATTTCCGCT GTTCCAGCTA GCTCGGAAAG CCaCaCaGCT TGATTTCCaG ACGGGCAGAA CTAGAGTTAT CTCCCCACAG
3501 CATAACTTGG tcccagtgaa CCGCAATCTC CCCCACGCCT TACTCCCCGA GCACAAGGAG AAACAACCCC GCCCGGTCAA AAAATTCTTG AGCCACTTCA
3601 AACACCACTC CGTACTTGTG GTCTCAGACC AAAAAATTGA ACCTCCCCAC AAGAGAATCG AATGGATCCC CCCGATTGGC ATAGCCGGCG CTGaTAAGAA 

3701 CTACAACCTG gctttcgggt ttccgccgca ggcacggtac cacctggtgt ttatcaatat tgg aactaaa TACAGAAACC atcactttca gcagtccgaa 
3801 CACCATCCCC CGACCTTGAA AACCCTCTCC CC7TTCCGCCC TCAACTCCCT TAACCCCCCA GGCACCCTCC TCGTGAAGTC CTACCCTTAC CCCCACCCCA
3901 ATACTGACCA CCTACTCACC CCTCTTGCCA GAAAATTTCT CAGAGTCTCT CCAGCCAGCC CAGaGTCCGT CTCAAGCAAT ACAGAAATGT ACCTCATCTT
4001 CCGACAACTA GACAACAGCC GCACACCACA ATTCACCCCC CATCATCTGA ATTGTGTGAT TTCGTCCGTG TACGAGGCTA CAAGAGACGG ACTTCGAGCC
4101 GCACCCTCAT ACCCCACTAA AAGGGAGAAC ATTCCTGATT GTCAAGAGGA AGCACTTGTC AATGCaGCCA ATCCGCTCCG CACACCACGC GAAGGACTCT 
4201 CCCGTCCCAT ctataaacgt tggcccaaca gtttcaccga TTCAGCCACA GAGACCGGCA CCGCAAAACT GACTGTCTCC CAAGGAAAGA AAGTGATCCA
4301 CGCGGTTGGC CCTGATTTCC GGAAACACCC AGAGGCAGAA GCCCTGAAAT TGCTGCAAAA CGCCTACCAT GCAGTGGCAG ACTTAGTAAA TGAACATAAT 
4401 ATCAAGTCTG TCGCCATCCC ACTGCTATCT ACaGGCATTT ACGCAGCCGG AAAAGACCGC CTTGAACTAT CACTTAACTG CTTGACAACC CCGCTAGATA
4501 GAACTGATGC cgacgtaacc ATCTACTCCC TGGATAAGAA GTGGAAGGAA AGAATCGACG CGGTGCTCCA ACTTAAGGAG TCTCTAATAG ACCTGAAGGA 
4601 TGAGGATATG CAGATCGACG ACGAGTTAGT ATGGATCCAT CCGGACAGTT GCCTGAAGGG AAGAAAGGGA TTCAGTACTA CAAAAGGAAA GTTCTATTCG 
4701 TACTTTGAAG ccaccaaatt ccatcaagca gcaaaagata TCGCGGAGAT AAAGGTCCTG ttcccaaatc ACCAGGAAAG CAACGAGCAA CTGTGTGCCT 
4801 ACATATTGGG GGAGACCATG GAAGCAATCC GCGAAAAATG CCCCGTCGAC CACAACCCCT CGTCTAGCCC GCCAAAAACG CTCCCGTGCC TCTGCATGTA 
4901 TGCCATGACG CCAGAAAGGG TCCACAGACT CAGAAGCAAC AACGTCAAAG AAGTTACAGT ATGCTCCTCC ACCCCCCTTC CAAAGTACAA AATCAAGAAC 
5001 GTTCAGAAGG TTCAGTCCAC AAAAGTAGTC ctgtttaacc cgcatacccc TCCATTCGTT ccccccccta AGTACATAGA AGCGCCAGAA CaGCCTCCAG 
5101 CTCCGCCTCC ACaGCCCGAG GAGGCCCCCG AAGTTCCACC AACACCAACA CCACCTGCAG CTGATAACAC CTCCCTTGAT GTCACGGACA TCTCACTGGA 
5201 CATC G AAG AC AGTAGCGAAG gctcactctt TTCGAGCTTT AGCGGATCGG ACAACTCTAT TACTAGTATG gacagttggt cgtcaggacc tagttcacta

5301 GAGATAGTAG accgaaggca ggtcgtggtg gctgacgtcc atgccgtcca agagcctgcc cctgttccac cgccaacgct aaagaagatg gcccgcctcg 

5401 CAGCGGCAAG aatgcaggaa gagccaactc caccggcaag caccagctct gcggacgagt cccttcacct ttcttttggt ggggtatcca tctccttcgg 

5501 ATCCCTTTTC cacggagaga tgggcccctt cccagcggca caacccccgg CAAGTACATG ccctacggat gtgcctatgt ctttcggatc gttttcccac 

5601 ggagagattg acgagctgag ccgcagagta accgagtctg agcccctcct gtttcggtca tttgaacccg gcgaagtgaa ctcaattata tcctcccgat 

5701 CAGTTGTATC TTTTCCACCA CGCAAGCaGA GACGTaGACG CAGGAGCAGG AGGACCGAAT ACTGACTAAC CGGGGTAGGT GGGTACATAT 1 T TCGACGGA 
5801 CACAGGCCCT GGGCACTTGC AAATGGaGTC CGTTCTCCAG AATCAGCTTA CAGAACCGAC CTTCGAGCCC AATGTTCTCG AAAGAATCTA CGCCCCGGTG 
5901 CTCGACACGT CGAAAGAGGA ACaCCTCAAA CTCaCGTACC AGATGATCCC CACCGAACCC AACaAAAGCA CGTACCACTC TaGAAAAGTA GAAAATCAGA 
6001 AAGCCATAAC CACTGAGCGA CTCCTTTCaC GCCTaCCACT GTATAACTCT GCCaCAGaTC AGCCaGAATG CTATAAGATC ACCTACCCGA AACCATCGTA 
6101 TTCCAGCAGT GTACCGGCGA ACTACTCTGA CCCAAAGTTT CCTGTAGCTG TTTGCAACAA CTATCTGCAT GAGAATTACC CGACGGTAGC ATCTTATCAG 
6201 ATCACCGACG AGTACGATCC TTACTTGGAT ATGGTAGACG GGACAGTCGC TTCCCTAGAT ACTGCAACTT TTTGCCCCGC CAAGCTTAGA AGTTACCCGA 
6301 AAAGACACGA GTATAGAGCC CCAAACACTC GCAGTGCGGT TCCATCAGCG ATGCAGAACA CGTTGCAAAA CCTGCTCATT CCCGCGACTA AAAGAAACTG 
6401 CAACGTCACA CAAATCCGTG AATTGCCAAC ACTGGACTCA GCGACATTCA ACGTTGAATG CTTTCGAAAA TATGCATGTA ATGACGAGTA TTGGGAGGAG 
6501 TTTGCCCGAA AGCCAATTAG GATCACTACT GaGTTCGTTA CCCCaTACGT GGCCAGaCTG AAAGCCCCTA AGGCCGCCGC ACTGTTCGCA AAGACGCATA 
6601 ATTTCCTCCC ATTGCAAGAA G7CCCTATCG ATAGGTTCGT CATGGACATG AAAAGAGACG TCAAAGTTAC ACCTCGCACG AAACACACAG AAGAAAGACC 
6701 CAAAGTACAA GTGCTACAAG CCCCAGAACC CCTCGCGACC CCTTACCTGT GCCCGATCCA CCCGGAGTTA CTGCCCaGGC TTACAGCCGT CTTCCTaCCC 
6801 AACATTCACA CGCTTTTTGA catgtccgcg cacgactttg ATGCAATCAT AGCAGAACAC ttcaagcaag ctgaccccct actggagacg gatatcgcct
6901 CGTTCGACAA aagccaagac gaccctatcc ccttaactcg CCTGATGATC TTCGAAGACC tccctgtcga CCAACCACTA ctcgacttga TCGAGTGCGC 
7001 CTTTGCaGAA ATATCATCCA CCCATCTCCC CACGCGTACC CGTTTCAAAT TCGCGCCGAT GATGAAATCC GGAATGTTCC TCaCGCTCTT TGTCAACACA
7101 GTTCTGAATG TCGTTATCGC CAGCAGAGTA TTGGAGCAGC GGCTTAAAAC GTCCAAATGT GCACCATTTA TCGGCGACGA CAACATCATA CACGGAGTAG 
7201 TATCTGACAA AGAAATGCCT GAGAGGTCTG CCACCTGCCT CAACATGGAG GTTAAGATCA TTGACGCACT CATCGCCGAG AGaCCGCCTT acttctgcgg
7301 TGGATTCATC TTGCAAGATT CCGTTACCTC CACAGCCTGT CCCGTGGCGG ACCCCTTGAA AaCCCTGTTT AACTTCCGTA AACCGCTCCC AGCCGACGAC 
7401 GAGCAAGACG AAGACAGAAG ACGCCCTCTG CTAGaTGAAA CAAAGGCGTG GTTTAGAGTA CGTATAACAG ACACCTTACC AGTGGCCCTG CCAACTCGGT 
7501 ATGAGGTAGA caacatcaca CCTGTCCTGC TGGCATTGAG AACTTTTCCC CAGAGCAAAA GAGCATTTCA AGCCATCAGA GGCGAAATAA accatctcta 
7601 CCGTCGTCCT AAATAGTCAG CATACCACAT TTCATCTGAC TAATACCACA ACACCACCAC CATGAATAGA GGATTCTTTA ACATGCTCGG cccccgcccc 
7701 TTCCCGCCCC CCACTCCCAT GTGGACGCCG CCCACAAGGA CCCAGGCGGC CCCCaTGCCT GCCCGCAATG GGCTCGCTTC CCAAATCCAG CAACTGACCA 
7801 CAGCCGTCAG tcccctactc ATTCCACAGG CAACTAGACC TCAAACCCCA cccccacgcc CCCCCCCGCG CCACAAGAAG CAGGCCCCAA ACCAACCACC 
7901 GAAGCCGAAG AAACCAAAAA CACACCACAA CAAGAACAAC CAACCTCCAA AACCCAAACC CCCAAACACA CAACGTATGG CaCTCAACTT CCACCCCCAC
8001 ACACTCTTCC ACCTCAAAAA TCAGGACCGA CATGTCATCC CCCACCCACT CGCCATCCAA GGAAAGGTAA TCAAACCACT CCACGTCAAA GCAACTATTC
8101 ACCACCCTCT CCTATCAAAC CTCAAATTCA CCAACTCCTC AGCATACCAC atccagttcc cacacttccc cctcaacatc acaactcacc ccttcaccta
8201 caccacccaa caccctgaao gcttttacaa ctcccaccac ggagcggtgc actatactcc acgtacattt accatccccc cccgactagc aggcagagga

8301 GACAGTCCTC GTCCCATTAT GGaTAACTCA CCCCCGGTTG TCCCGaTAGT CCTCCGAGGG CCTGATGAGG GAACAAGAAC TGCCCTTTCG GTCGTCACCT
8401 GGAATAGCAA AGGGAAGACA ATCAAGACAA CCCCGGAAGG GACAGAAGAG TGGTCTGCAG CACCACTGGT CACGGCCATG TGCTTGCTTG GAAACGTCAG
8501 CTTCCCATGC AATCGCCCGC CCACATGCTA CACCCGCGAA CCATCCAGAG CTCTTGACaT CCTTGAAGAG AACGTGAACC ACGAGGCCTA CGACACCCTG
8601 CTCAACGCCA TATTGCGGTG CGGATCCTCC CGCAGAAGCA AAAGAAGCCT CACTGACGAC TTTACCTTCA CCAGCCCCTA CTTGGGCACA TGCTCGTACT

8701 gtcaccatac tgaaccgtgc tttagcccga TTAAGATCCA GCAGGTCTCG gatgaagcgg accacaacac catacgcata cagacttccg cccagtttcc
8801 ataccaccaa acccgagcag caagctcaaa taagtaccgc tacatgtcgc tcgagcagga tcataccctc aaagaaggca ctatcgatga catcaagatc
8901 AGCACCTCAG gaccgtgtag aagccttagc tacaaaggat actttctcct cgcgaagtgt cctccagggg acagcgtaac ggttagtata gcgagtacca
9001 ACTCACCAAC gtcatgcaca ATGGCCCGCA ACATAAAACC AAAATTCGTG ggacgcgaaa AATATGACCT acctcccgtt caccgtaaga agattccttg 
9101 CACAGTGTAC GACCGTCTGA AaGAAACAAC CGCCGGCTAC ATCACTATGC ACAGCCCCGG ACCGCACGCC TATACGTCCT ATCTGGAGGA ATCATCAGGG 
9201 AAAGTCTACG CGAAGCCACC ATCCCCAAAG aacattacgt ACGAGTGCAA gtccggcgat tacaagaccg gtaccgttac gacccgtacc gaaatcaccg 
9301 cctgcaccgc catcaagcag tgcgtcgcct ataagagcga ccaaacgaag tgggtcttca attcgccgga cttgatcaga catcccgacc acacggccca 
9401 AGCGAAATTG catttacctt tcaagctgat ccccagtacc tgcatcgtcc ctgttgccca cgcgccgaac gtagtacacg cctttaaaca catcagcctc

9501 CAATTAGACA CAGACCACCT CACATTCCTC ACCACCAGGA GaCTAGGCCC AAATCCGGAA CCAACTACTG AATGGATCaT CGGAAAGACG GTTAGAAACT 

9601 TCACCGTCGA CCCACATCGC CTCGAATACA TATCCGCCAA TCaCCAACCG GTAACGGTCT ATCCCCAAGA GTCTGCACCA GGaGACCCTC ACGGATCGCC 

9701 ACACGAAATA GTACAGCATT ACTACCATCG ccatcctgtg tacaccatct tagccgtcgc ATCAGCTGCT gtggcgatga tcattggcgt AACTGTTGCA

9801 CCATTATGTC cctgtaaacc gccccgtgag tccctgaccc catatgccct ccccccaaat gccgtgattc caacttccct cccacttttg tcctgtctta 

9901 GGTCGGCTAA TCCTGAAACA TTCACCCAGA CCATGAGTTA CCTATGGTCG AACAGCCAGC CATTCTTCTG GGTCCAGCTG TGTATACCCC TGGCCGCTGT 

10001 CATCGTTCTA atgcgctgtt cctcatcctc cctccctttt ttagtcgttg CCCGCGCCTA cctggcgaac gtagacgcct ACGAACATGC caccactctt

10101 CCAAATGTGC CACAGATACC GTATAAGGCA CTTGTTGAAA GGGCAGGGTA CGCCCCGCTC AATTTGGAGA TTACTGTCAT GTCCTCGGAG GTTTTCCCTT 
10201 CCACCAACCA AGAGTACATC ACCTCCAAAT TCACCACTGT CGTCCCCTCC CCTAAAGTCA AATCCTCCGG CTCCTTGCAA TGTCAGCCCG cccctcaccc 
10301 AGaCTATACC TGCAAGGTCT TTGGAGGGGT GTACCCCTTC ATGTGGGGAG GAGCACAATG 1 1 1 UGCGAC AGTGAGAACA gccagatgag TGAGGCGTAC 
10401 CTCCAATTCT CAGCACATTG CCCGACTGAC CACGCGCAGG CGATTAAGGT gcatactccc GCGATGAAAG tacgactacg TATAGTCTAC CGGAACACTA 
10501 CCAGTTTCCT AGATGTGTAC GTGAACGGAG TCACACCAGG AACGTCTAAA GACCTGAAAG TCATaGCTCG ACCAATTTCA GCATCCTTTA CACCATTCGA 
10601 TCACAAGGTC gttatccatc CCCGCCTGGT CTACAACTAT GACTTCCCGG AATACGGAGC CATCAAACCA ggagcctttg GAGACATTCA agctacctcc 
10701 TTGACTAGCA AACATCTCAT cgccagcaca gacattagac tactcaaccc ttccgccaag aacgtccatg tcccgtacac gcaggcccca tctcgattcg 

10801 AGATG7CGAA AAACAACTCA GGCCCCCCAC TCCACGAAAC CGCCCCTTTC CGGTCCAAGA TTGCAGTCAA TCCGCTTCGA CCCGTCGACT CCTCATACGG 
10901 GAACATTCCC ATCTCTATCG ACATCCCGAA CGCTCCCTTT ATCAGGACAT CaGaTGCaCC ACTGGTCTCA aCAGTCAAAT GTGATGTCaG TGAGTCCACT 
11001 TACTCAGCCG acttcccccg catccctacc ctccagtatg TATCCCACCC CGAAGGACAA tcccctgtac ATTCGCATTC CACCACAGCA ACCCTCCAAG
11101 AGTCGACAGT TCATGTCCTG GaGAAAGGaG CGCTGACAGT ACACTTCaGC ACCGCGaGCC CACaGGCGAA cnTATTGTA TCGCTGTGTG GTAAGAAGAC
11201 AACATCCAAT GCaGAATCCA AACCACCACC TGACCATATC GTCAGCACCC CCCACAAAAA TGACCAAGAA TTCCAAGCCG CCATCTCAAA AACTTCATGG
11301 AGTTCCCTGT TTGCCCTTTT CGGCGGCGCC TCGTCCCTAT TAATTATAGG ACTTaTGATT TTTGCTTCCA GCATGATGCT CACTACCaCA CCAAGaTCAC
11401 CGCTACCCCC CAATGACCCG ACCaGCAAAA CTCGATGTAC TTCCCAGGAA CTCATGTGCA TAATGCATCA CCCTGGTATA TTAGATCCCC GCTTACCCCG
11501 GGCAATATAG CAACACCAAA ACTCGACGTA TTTCCCAGGA AGCCCAGTGC ATAATGCTGC GCAGTGTTGC CAAATAATCA CTATATTAAC CATTTATTTA
11601 GCGGACGCCA AAACTCAATG tatttctgag GAAGCATGGT CCATAATGCC ATCCAGCCTC TCCATAACTT TTTATTATTT CTTTTATTAA TCAACAAAAT
11701 TTTGTTTTTA ACATTTN 
Amino Acid Sequence of the Nonstructural Polyprotein

1 MEKPWNVDV DPQSPFWQL QKSFPQFEW AQQVTPNDHA NARAFSKIA5 KUELEVPTT ATtLDIGSAP ARRMF3EHQY HCVCPMRSPE DPDRMMKYAS

101 KLAOCAOCrT NKNLHEXKD LRTVLDTPDA ET7SLCFHND VTCNTRAEYS VWQDVYINAP GTnrHGAMKG VRTLYWTGFD TTQFMFSAMA CSYPAYNTNW 

201 ADEXVLEARN IGLCSTXLSE CRTCKL51MR KKEUCPCSRV YFSVGSTLYP EHRASLQSWH LPSVFHLXCK QSYTCRCOTV VSCEGYWKK rnSPGITCE 

301 TVCYAVTNNS EGFliOCVTD TVKCERVSF? VCTYIFAT1C OQMTCtMATD ISPDOAQK1X VGLNQRIVtN CJCTNRNTNTM QNYLLHIAQ GFSKWAX£RX 

401 EOLONEKMLC TRERKLTYCC LWAFRTKKVH SFYRPPCTQT IVXVPASFSA FPMSSVWTTS LPMSLRQKIX LALQPKXEEK LLQVPEELVM EAKAAFEDAQ 

501 EESRAEXLRE ALPPLVADKO IEAAAEWCE VEGLQAOICA ALVETPRCHV RIIPQaNDRM IC<JYIW$PT SVUCNAKLAP AHPtADQVXI ITHSGRSGRY

601 AVEPYOAKVL MPAGSAVPWP EFLALSESAT LVYNEREFVN RKLYKlAMHG PAXNTEEEQY XVTKAELAET EYVFDVDKKR CVKKEEASGL VLSCELTNPP 

701 YHELALXCLX TRPWPYKVE TICVtCAPGS CKSAIOCSTV TARDLVTSCK KENCREIQAO VULRCMQTT SKTVDSVMLN CCRXAVEVLY VDEAFACHAG 

801 ALLALIAIVR PRHKWLCCD PKQCGFFNMM QUCVYFNHPE KDICTKTFYX F1SRRCTQPV TAfVSTTHYD GKMKTTNPCK KN1EIDITCA TKPKPGDUL

901 TCFRCWVKQL Q1DYPGHEVM TAAASQGLTR KGVYAVRQKV NENPLYATO EHVNVU.TRT EDRLVWKTLQ CDPWKQLTN VPKCNFQATI EOWEAEHKCI 

1001 TAAINSPAPR TNPFSCKTNV CWAXRLEPIL ATAGIVLTGC QWSEUTQFA DOKPHSAfYA LDVJCIXFFG MDLTSGLFSK QS1PLTYHPA DSARPVAHWD

1101 NSPGTRXYGY DHAVAAELSR RFFVFQLAGK GTQLDLQTGR TRV1SAQHNL VPVNRNLPHA LVPEKKEKQP GPVKKFLSQF KHHSVLWSE EKIEAPWCRI

1201 EWIAPICIAG ADKNYNl-AFG FPPQARYDLV FtNIGTKYRN HHFQQCEOKA ATLXTLSR-SA LNCLNPCCTL WKSYGYAOR NSEDWTALA RXFVRVSAAR 

1301 PECVSSHTEM YUFRQLONS RTRQFTPKHL NCVtSSYYEG TRDGYGAAPS YRTXRENIAD CQEEAVVNAA NPLGRPGEGV CRAJYXRWPN SFTDSATETC 

1401 TAXLTVCQGK KVIHAVCPDF RKHPEAEALX LLQNAYHAVA DLVNEHNDCS VAIPLLSTCI YAAGKDRLEV SLNCLTTALO RTDAOVTTYC LDJCKWKERID

1501 AVLQUCESV! EUCDEDMEID OELVWIHPDS CUCGRXGFST TKGKLYSYFE GTXFWQAAJCD MAEIKVLFFN OQESNEQLCA YILGETMEA1 REKCPVDHNP 

1601 SSSPPKTLPC LCMYAMTPER VHRLRSNNVK EVTVCSSTTL PXYXIKNVQK VQCTXWLFN PHTPAFVPAR KYlEAfEQPA APPAQAEEAP EVAATPTPPA 

1701 ADNTSLDVTD ISLDMEDSSE GSLFSSFSGS DNSfT SMDS W SSGPSSLEIV DRRQWVAOV HAVQEPAPVP PPRUCKMAftL AAARMQEEFT FPASTSSADE 

1801 SLHLSFGGVS MSFGSLFDGE MCALAAAQPP A5TCFTDVPM SFGSF5DGEI EELSRRVTES EPVLFGSFEP GEVNSIISSR SWSFPPRXQ RRRRRSRRTE

1901 Y 



Amino Acid Sequence of the Structural Polyprotein 

1 MNRGFFNMLG RRPFPAPTAM WRPRRRRQAA PMPARNGLAS QEQQLTTAVS ALVIGQATRP GTPRPRPPPR QKKQAPKQPP KPKKPXTQEK KXKQPAXPKP

101 GKRQRMALKL EADRLFDVKN EDGOVTGHAL AMEGKVMKPL HVKGTTDHPV LSKLKFTXSS AYDMEFAQLP VNMRSEAFTY TSEHPEGFYM WHHGAVQYSG 

201 GRFTTPRGVG GRGOSGRPIM DNSGRWACV LGGADEGTRT ALSVVTWNSJC GKTIKTTTEC TEEWSAAPLV TAMCU.GNVS FPCNRPPTCY TREPSRALD! 

301 LEENVNHEAY DTLLNA1LRC GSSGRSKRSV TDDFTLTSPY LGTC3YCHHT EPCFSPIKIE QVWOEADDNT 1R1QTSAQFG YDQSGAASSN KYRYMSLEQD 

401 HTV XEGTM DD (XISTSCPCR RLSYXGYFLL AKCPPGOSVT VSIAS5NSAT SCTMARXIKP KFVGREKYDL PPVHGKKIPC TVYDRUCETT AGYTTMHRPG 

501 PHAYTSYLEE SSGKVYAXPP SGKNTTYECK CGDYXTGTVT TRTETTCCTA IKQCVAYXSD QTXWVFNSPD LIRHADHTAQ GKLHLPFKU PSTCMVPVAH

601 APNWHGFKH ISLQLDTDHL TLLTTRRLGA NPEPTTEWn CKTVRNFTVD RDGLEYIWGN HEPVRVYAQE SAPGOPHGWP HEIVQHYYWR HPVYTTLAVA 

701 SAAVAMMIGV TV AA LCACK A RRECLTPYAL APNAV1PTSL ALLCCVRSAN AETFTETMSY LWSNSQPFFW VQLC1PLAAV rVLMRCCSCC LPFLWAGAY 

801 LAKVOAYEHA TTVPNVPQIP YKALVERAGY APLNLEITVM SSEVLPSTNQ EYrTCKFTTV VPSPKVKCCG SLECQPAAHA DYTCXVFGGV YPFMWGGAQC

901 FCDSEKSQMS EAYVELSAOC ATDHAQAtKV KTAAMKVGUt rVYCNTTSFL DVYVNGVTPG TSKDLKVtAG P1SASFTPFD HKW1HRCLV YNYOF7EYGA 

1001 MKPGAFCDtq ATSLTSXDL! ASTDIRL1JCP SAKNVHVPYT QAASGFEMWK NNSGRPLQET APFGCK1AVN PLRAVDCSYG N1PISJD1PN AAFTRT3DAP«- 

1101 LVSTVKCDVS ECTYSADFGG MATLQYVSDR EGQCPVHSHS STATLQESTV HVLEXGAVTV HFSTA5PQAN FTVSLCGKKT TCNAECXPPA DHTVSTPHKN 

1201 DQEFQAA1SK TSWSWLFALF CGASSLUIC LMIFACSMML TSTRR 
Nucleotide Sequence of S55

1 ATTCCCCCCC TaCTaCaCaC TaTTCaaTCa aaCaGCCCaC CAATTCCaCT ACCATCACAA TGCaGaaGCC aCTACTTAAC CTACaCCTaC accctcacag tccctttctc CTCCAACTGC
121 AAaaGaCCTT CCCCCAATTT CAGCTaCTaG CaCaGCaCCT CaCTCCAAaT CACCaTGCTA ATCCCACACC ATTTTCGCAT ctcgccacta aactcatcga gctgcagctt CCTACCaCAG
241 CCACCATTTT CCACATACCC AGCCCACCCC CTCCTACAAT CTTTTCCCAC CACCACTACC ATTCCCTTTC CCCCATCCCT AGTCCACAaG aCCCGCACCG CATCATCAAA TATCCCAGCa
361 AACTCCCGGA aaaaGCATGT AACaTTaCAA aCaaCAACTT GCaTGaGAAG aTCAAGGACC TCCGGACCCT aCTTGATACA CCCGATCCTG AaaCCCCATC ACTCtLCl K CACAACCATQ

481 ttacctgcaa cacgcgtccc gagtactccg tcatgcagca cctctacaTC aacgctcccg caactattta ccaccaggct atgaaaggcg tgcggaccct ctactgcatt gccttccaca

601 CCACCCAGTT CaTCTTCTCC CCTaTCGCaG CTTCCTACCC TGCATACAAC ACCAACTDGG CCCACGaaaa aCTCCTTCAA GCCCGTAaCa TCCCACTCTG CAGCACAAAG CTCACTCAAG 
721 GCAGGACAGG aaaGTTCTCG ATAATGaGGA aGaaGGACTT GAaGCCCGGG TCACGGCTTT ATTTCTCCCT TGGATCCACA CTTTACCCaG AACACACAGC CAGCTTGCAG ACCTGGCATC
841 TTCCATCGGT CTTCCACTTG AAAGGAAAGC AGTCGTACaC TTGCCGCTGT GaTaCaGTCC TGaGCTCCGA AGCCTACGTA gtcaagaaaa TCACCATCAG TCCCGGGaTC ACGGGaCAAA
961 CCGTGGGaTA CGCGGTTACA AACAATAGCG AGGGCTTCTT CCTATGCAAA gttaccgata CaGTAAAAGG AGAACGGGTA tcgttccccg tgtccacgta TATCCCGCCC ACCATATGCC 
1081 ATCACATCaC cggcataatg gccacggata TCTCACCTCA CGATGCaCAA AAACTTCTGG TTGGGCTCAA CCAGCGAATC CTCATTAACG GTAaGACTAA CAGGAACACC AATACCATGC
1201 AAAATTACCT TCTGCCAATC ATTGCACAaG CCTTCAGCAA ATGGGCCAAG GAGCCCAAAG AAGATCTTCA CAATGAAAAA aTGCTGCGCA CCAGAGACCG caagcttaca tatggctgct 
1321 tgtgggcgtt TCGCACTAAG AaaGTCCACT CGTTCTATCG CCCACC7GCA ACGCAGACCA TCCTAAAaGT CCCACCCTCT tttaccgctt TCCCCATCTC ATCCCTATGG ACTaCCTCTT 
1441 TCCCCATCTC GCTGAGGCAG AAGaTGAAAT TGCCATTACA ACCAAAGAAG GaGGAAAAAC TGCTGCAAGT CCCGCAGGAA TTACTTATGG AGGCCAAGGC tgctttcgac GATGCTCAGG
1561 AGGAATCCAG AGCGGaGAaG CTCCCaCAAG CaCTCCCaCC ATTAGTGGCA CACAAAGGTA TCGAGGCAGC TGCGGAAGTT GTCT G CGAAG tggagcggct CCAGGCCCAC ACCGCAGCAC
1681 CACTCGTCGA AACCCCGCGC GCTCATGTAA GGaTAATaCC TCAAGCAAAT GACCGTATGA TCGCACAGTA TaTCGTTGTC TCGCCCATCT CTGTGCTCaa GAACGCTAAA CTCGCACCAG
1801 CACACCCCCT agcagaccac gttaagatca taacgcactc CGGaagatca ggaaggtatg cagtcgaacc ATACGACGCT aaagtactca tgccagcagg AAGTGCCGTA CCATGGCCAG
1921 aa i . _ i i agc actgagtgag agcgccacgc ttctgtacaa cgaaagagag tttgtgaacc gcaaCctgta ccatattocc atccacggtc ccgctaagaa tacacaagag cagcagtaca
2041 agcttaCaaa ggcagagctc gcagaaaCag agtacgtgtt tcacgtggac AAGAaGCCaT gcgttaagaa ggaagaagcc tcaggacttg tcctttcggg acaactcacc aacccgccct
2161 atcaccaact agctcttgag ggactgaaca ctccaccccc gctcccctac aaggttgaaa caataggagt gatacccaCa ccaGgatccg gcaagtcagc tatcatcaag tcaactgtca
2281 CCGCACCTGA tcttgttacc accgcaaaga aagaaaactg ccgcgaaatt caggccgacc tgctaccgct caggggcatc CAGATCACCT CCaaCaCAGT gcattccgtt atgctcaacg
2401 catgccacaa agccctacaa ctgctgtatg ttgaccaagc cttccggtgc cacgcaggag cactacttcc cttgattoca atcgtcacac cccgtaagaa gctagtacta tgcgcagacc
2521 ctaagcaatg cgcattcttc aacatgatgc aactaaaggt acatttcaac caccctcaaa aagacatatg taccaaGaca ttctaCaaCT ttatctcccg acgttgcaca cagccagtca

2641 CCCCTATTCT ATCCaCaCTG CaTTaCGaTG GaaaaaTGaa aaCCaCaaaC CCCTCCAAGa aGaaCaTCCa aatccacatt acaggggcca cgaagccgaa gccaggggac atcatcctca
2761 catc 1 1 1 ccc cgcgtccgtt aagcaactgc aaatccacta tcccccacat cagctaatga cagccgcogc ctcacaaggc ctaaccagaa aaggagtata tgccgtcccg caaaaagtca
2881 atgaaaaccc cctctacgcc atcacatcag agcatctcaa cctcttgctc acccgcactc aggacaggct actatggaaa actttacagc gcgacccatg Cattaagcag ctcactaacc
3001 TAC CTAAAGG AAA 1 1 [ I LAG gccaccatcc aggactggga agctcaacac AAGGGAATAA TTCCTGCGAT AaaCaGTCCC cctccccgta ccaatccgtt cagctccaag actaacgttt 
3121 CCTCCCCCAA ACCACTCCAA CCGaTaCTCC ccacgcccgg tatcctactt accccttgcc actgcagcga cctcttccca cactttcccg ATGACAAACC acactcggcc atctacccct
3241 tacacctaat ttgcattaag tttttccgca tccacttcac aaccggcctc ttttccaaac agagcatccc cttaacctac catcctcccc actcagccag cccagtagct cattgccaca
3361 acagcccagg aacacgcaag tatccctacc atcacgccct tcccgcccaa ctctccccta cattttcggt cttccagcta cctcccaaag ccacacagct tgatttgcag accgccacaa
3481 ctagagttat ctctgcacag cataacttgg tcccactcaa ccgcaatctc cctcacccct tactccccca gcacaaccag aaacaacccc ccccogtcca aaaattcttg acccacttca
3601 AACACCACTC CCTACTTCTC aTCTCACaCA aaaaaattca agctccccac aacacaatcg aatccatccc ccccattcgc atagcccccc cacataacaa ctacaacctg gctttccgct 
3721 TTCCCCCGCA CCCaCCCTAC CaCCTGCTGT TCATCAATaT TCCAACTAAA tacagaaacc ATCACTTTCA aCaCTCCCAA CaCCaCCCCG CCACCTTCAA AaCCCTTTCG ccttcccccc 
3841 TCAACTGCCT TAACCCCCCA gccaccctcc TGCTCAACTC ctacccttac gcccaccgca ATACTCACCA CCTACTCACC CCTCTTGCCA CAAAATTTCT CaCACTCTCT GCAGCCaCCC
3961 CACAGTCCCT ctcaagcaat acacaaatct acctcatttt cccacaacta cacaacagcc ccacaccaca attcaccccg catcatttca ATTCTCTCAT ttcctccctc TaCGaCGCTa
4081 caacagaccg acttgcagcc gcaccctcct accgtactaa aacccacaac attcctcatt ctcaacacca accacttctc aatccagcca atccactcgc cacaccagga gaagcactct

4201 GCCCTCCCAT CTATAAACCT TGGCCGAaCA CTTTCaCCCA TTCAGCCACA GACaCAGCTA CCSCAAAaCT CACTCTGTCC CAACGaaaCa aactcatcca cgcgcttggc cctcatttcc

4321 ccaaacaccc acaggcagaa gccctcaaat tgctgcaaaa cccctaccat gcactcccag acttagtaaa tcaacataat atcaactctc tccccatccc actcctatct acacccattt
4441 acgcagccgg aaaacaccgc cttcacgtat cacttaactc cttcacaacc gcgctagaca gaactcatgc cgacgtaacc atctactccc tccataacaa ctccaagcaa acaatcgacg
4561 CGGTGCTCCA ACTTAAGGAG tctctaactg AGCTGAACCA tcagcatatg gacatcgacg accagttact atggatccat cccgacactt gcctgaaggg aagaaaccca ttcagtacta
4681 caaaaggaaa gttgtattcg tactttcaag gcaccaaatt ccatcaagca ccaaaagata tgccgcacat aaaggtcctc ttcccaaatg accaggaaag caacgaacaa ctctgtccct
4801 acatattcgc gcacaccatg caagcaatcc cccaaaaatg cccgctccac cacaacccct cctctagccc gccaaaaacc ctcccctccc tctctatcta tcccatcacg ccagaaaggg
4921 tccacagact cagaagcaat aacctcaaag aagttacagt atgctcctcc accccccttc caaagtacaa aatcaacaat cttcacaagg ttcagtccac aaaagtagtc ctctttaacc
5041 cccatacccc cgcattcctt ccccccccta actacataca aGCACCaCAA cagcctgcac ctccccctcc acacccccac gacgcccccc cacttgtagc cacaccaaCa ccacctgcag
5161 CTCATAACAC ctcccttgat ctcacgcaca TCTCACTGGA CATGGAACAC agtagcgaag gctcactctt ttccagcttt agcggatcgg ACAACTACCG AAGGCaCCTG ctcctggctg
5281 acgtccatgc cgtccaacag cctgcccctg ttccaccccc aaggctaaag aagatggccc gcctcccagc ggcaacaatg caggaagagc caactccacc gccaagcacc agctctccgg

5401 ACGAGTCCCT TCaCUMLI TTTGATGGGG TaTCTATaTC CTTCGCaTCC CTTTTCCACC GaCaGaTGGC CCOCTTGCCA CCGGCACAAC CCCCGGCAAG tacatcccct acccatgtcc

5521 ctatctcttt cggatcgttt tcccacggag agattgacca gttgagccgc agagtaaccc agtcggaccc cctcctcttt gggtcatttg aacccggcca agtgaactca ATTATATCCT
5641 CCCCATCAGC CGTATCTTTT CCACCACGCA aGCaGaGaCG TAGACGCAGG AGCAGGaGGA cccaatactg tctaaccccc ctacctccgt aCaTATTTTC cacggacaca ggccctcggc
5761 acttccaaaa caactccctt ctgcagaacc accttacaga accgaccttg CaCCCCaaTG TTCT CC aaaG aatctacccc cccctgctcc acacctccaa acaggaacac ctcaaactca
5881 cctaccagat catgcccacc caagccaaca aaagcaccta ccagtctcca AAaCTAGaaa aCCaGAAACC cataaccact cagccactgc tttcagcgct acggctctat aactctccca
6001 cagatcagcc acaatcctat AaGaTCACCT accccaaacc atcgtattcc accagtgtac cacccaacta CTCTCACCCA aaGTTTGCTG TACCTCTTTC TAACAaCTAT ctccatgaca
6121 ATTACCCCAC CCTACCATCT TATCaCaTCA CCCaCGACTa CCaTGCTTAC TTCCaTaTCG TaGaCGGGAC aCTCCCTTCC CTaGATaCTC CaaCTTTTTC CCCCGCCAAC CTTaCaaCTT 
6241 ACCCCAAAAC ACACCACTAT ACaCCCCCAA ACATCCCCAG TCCCCTTCCA TCAGCCATGC ACAACACCTT GCAAAACGTG CTCATTCCCC CCaCTAAAAG AAACTGCAAC ctcacacaaa 
6361 tgcctcaact CCCAACACTG GACTCaGCGA CATTCAACCT TCAATGCTTT CGAAAATATG CATGCAATCA CCAGTATTCG GaCCAGTTTC CCCCAAAGCC AATTACCATC ACTACTCAGT 
6481 TCCTTACCCC ATACCTGCCC aCaCTGaaaG CCCCTaaGGC CCCCCCaCTG TTCGCAAaGa CGCATAATTT GGTCCCATTG CaaGaaGTCC CTATCCaTAG aTTCCTCATC CACaTCaaaa
6601 CaCaCGTGAa agttacacct CGCACCaaaC acacacaaga AACACCGaaa gtacaagtca tacaacccgc agaacccctg GCCaCCGCTT aCCTATGCCC CATCCACCCC cacttactgc 
6721 CCaGGCTTaC AGCCGTTTTG CTACCCaaCa TTCaCaCGCT CTTTCaCaTG TCCCCCGaGG aCTTTGaTGC AaTCATACCA CaaCACTTCa aGCaaGGTGa CCCCCTACTC CaCACCCata
6841 TCCCCTCC7T CGaCAAAAGC CaaCaCCaCC CTATGCCCTT AACCGGCCTG ATGATCTTGC A^GaCCTCGC TGTGCaCCAA CCaCTACTCG ACTTCATCCA CTCCCCCTTT CGAGaaataT

6961 catccaccca tctccccacc ggtacccctt tcaaattcgg ggccatcatg aaatccggaa tcttcctcac ccTUTTTcmc aacacagttc tgaatgtcct tatcgccagc acactattcg

7081 ACCACCCCCT TAAAaCGTCC AAATCTGCAG CATTTATCCG CCACCaCAAC ATTATACACC CAGTaCTATC TCaCAAACAA ATCCCTCACA GCTCTCCCAC CTCGCTCAAC ATCCACCTTa
7201 ACATCATTCA CGCAGTCATC CGCCACACAC CACCTTACTT CTGCGGTGGA TTCATCTTCC AACATTCCCT TACCTCCACA OCCTCT C GCC TCCCCCaCCC CTTCAaaaGG ctctttaact
7321 TGOGTAAACC OCTCCCAOCC CACGaTCaCC aaGaCGAAGA CaGaaGaCGC CCTCTCCTAC aTGAAaCaaa CCCCTGCTrr ACACTACCTA TAACACa cac CTTACCACTC CCCGTOGCaa
7441 CTCCGTATGA GCTaCACAAC ATCACACCTG tcctgctgcc ATTCaCAACT tttgcccaga GCaaaaGAGC atttcaagcc ATCaGAGGGG AAATaaaCCa tctctaccct cctcctaaat
7561 ACTCAOCATA CTaCaTTTCA tctcactaat accacaacac caccaccatg aatacaccat tctttaacat cctccccccc cgccccttcc cagcccccac tcccatctcc accccoccca 
7681 CAACCACCCA occccccccc ATCCCTGCCC CCAATGGGCT GGCTTCCCAA ATCCAOCAAC TCACCACaCC cctcactgcc ctagtcattg GACAGCCAAC TACACCTCAA ACCCCACCCC
7801 cacccccccc cccgcgccag aagaaccacc ccccaaacca accacccaag cccaacaaac caaaaacaca gcagaacaag aagaagcaac ctccaaaacc caaaccccca aacacacacc

7921 CTATCCCACT TAACTTCCAC CCCCACACAC TCTTCCACCT CAAAAATCAC caccgacatc tcatccggca cccactggcc ATCCAACCAA ACCTAATCAA ACCACTCCAC CTCAAACCAA
8041 CTATTCACCA CCCTCTGCTA TCAAAGCTCA aaTTCACCAA CTCCTCaGCA TACCACATCC ACTTCCCACA CTTCCCCCTC AACATCACaa CTCACCCGTT CACCTACACC AGTGAACaCC
8161 CTCAACCCTT CTACAACTCC caccacccac cggtgcagta tactccaccc acatttacca tccccccccc actaccaccc acaccacaca ctgctcctcc CATTATCCAT AACTCaCCCC

8281 CCCTTCTCCC CATACTCCTC CCACCCCCTC ATCaGOCAAC AaCaaCCCCC CTTTCCCTCC TCACCTCCAA TaCCAAACCC AaCaCAATCA aCaCAACCCC CGAACCCACA CAACACTCCT

8401 ctcctgcacc actcctcacc cccATcrccr tccttggaaa cctcacctpc ccatccaatc ccccccccac atcctacacc ccccaaccat ccacacctct ccacatcctc caacacaacc

8521 TCAACCACCA CCCCTACCAC ACCCTCCTCA ACCCCaTATT CCCCTCCCCA TCCTCCCOCA CAACTAAAaC AACCCTCaCT CACCaCTTTA CCTTCaCCAG CCCCTACTTC CCCACATCCT
8641 CCTACTCTCA CCATACTCAA ccgtgcttta cccccattaa GATCGAGCAG GTCTGGGaTG AACCCGACCa CAACaCCATA cgcatacaca CTTCCGCCCA C T TT C GATAC caccaaagcc
8761 gagcaccaaC ctcaaataag taccgctaca tgtcgctcca ccaccatcat actctcaaag aaggcaccat ggatcacatc aagatcagca cctcaggacc gtct a gaagc cttagctaca
8881 AAGGATACTT tctcctcgcg aagtgtcctc cacgccacag cgtaacggtt agcataccga ctagcaactc agcaacctca tccacaatgg CCCGCAAGAT AAAACCAAAA TTCGTGGGAC
9001 GGGAAAAaTa TCaCCTaCCT CCCGTTCaCG GTAAGAAGAT TCCTTGCaCA GTCTACGaCC CTCTGaaaCa aaCaaCCGCC GGCTACATCA ctatgcacag gcccccaccg CACCCCTATA 
9121 CATCCTATCT GGaGGAATCA TCaGGGAAaG tttacgcgaa gccaccatcc gggaagaaca ttacgtacga gtgcaagtgc cgccattaca agaccgcaac cgttaccacc cctacccaaa
9241 TCACGGGCTG CACCGCCATC AAGCAGTGCG TCGCCTATAA GaGCCACCAA ACGAAGTCGG TCTTCAACTC GCCGCACTCG ATCAGaCACG CCGACCaCaC CGCCCAAGGG AAATTCCATT
9361 TCCCTTTCAA GCTCATCCCC actacctgca TGGTCCCTCT TCCCCACGCC CCCAACCTAC TaCACGCCTT TAAaCACATC aGCCTCCAAT TAGACACAGA CCATCTCaCA ttgctcacca
9481 CCAGGAGACT AGGGGCAAAC cccgaaccaa ccactcaatc gatcatcgga AACACGCTTA GAAaCTTCAC cctccaccca GaTCGCCTGG AATACATATG gcgcaatcac gaaccactaa
9601 CGGTCTATGC CCAAGAGTCT CCACCACCAG ACCCTCACGG ATGGCCACAC GAAATAGTAC AGCATTACTA TCATCGCCAT CCTGTGTACA CCATCTTaGC CGTCGCATCA GCT GC TGT G G 
9721 CCATCATCAT tcgcctaact CTTCCAGCAT TaTGTGCCTG TAAAGCGCGC CGTCACTGCC TCACGCCATa TGCCCTGCCC CCaAATGCCG tgattccaac ttccctcgca cttttgtgct
9841 GTGTTACGTC CCCTAATCCT GAAACATTCA CCCAGACCAT CAGTTACTTA TCGTCGAACA GCCAGCCCTT CTTCTGCGTC cagctgtgta TaCCTCT G GC coctctcgtc GTTCTAATGC
9961 GCTCTTCCTC ATCCTCCCTC CCTTTTTTAG TCCTTGCCGG CCCCTACCTG GCCAAGGTAG aCGCCTACCA aCATCCCACC ACTCTTCCAA ATCTCCCACA GaTaCCCTAT AACGCACTTG
10081 TTGAAACGGC A GGGTAC CCC CCGCTCAATT TGCACATTAC TGTCATGTCC TCGGAGGTTT TCCCTTCCAC CAaCCAAGaG TACATTACCT GCAAATTCAC CACTGTGGTC CCCTCCCCTA
10201 AAGTCACATG CTGCGCCTCC TTCCAATGTC agccccccgc tcacgcagac tatacctgca AGCTCTTTGG aggggtgtac CCCTTCATGT GCGCAGGAGC ACAATCTTTT TGCCACACTC 
10321 ACAACACCCA GATCAGTGaG CCCTACCTCG AATTCTCAGT AGaTTCCCCG ACTCACCaCG CCCAGGCGAT TaaGGTGCAT ACTGCCGCCA tgaaagtagg ACTGCGTATA GTGTACGGGA
10441 ACACTACCAG TTTCCTACAT CTCTACCTCA ACCCaCTCAC ACCAGGAACG TCTAAACACC TGAAACTCAT AGCTGGACCA ATTTCAGCAT TCTTTACACC ATTCGATCAC AAGGTCCTTA
10561 TCAATCCeCG CCTGGTCTaC AACTATGACT TTCCGCAATA CGGAGCGATG AAACCAGGaG CGTTTGGaGA CaTTCAAGCT ACCTCCTTCA CTAGCAAAGA CCTCATCCCC AGCACAGACA
10681 TTAGGCTACT CaaGCCTTCC CCCAaGAACG TCCATGTCCC GTACACGCAG GCCGCATCTC CATTCGAGAT CTCGAAAAAC AACTCAGGCC GCCCACTGCA GGAAACCGCC ccttttgggt
10801 gcaagattgc agtcaatccc cttcgagcgg tgcactcctc atacgggaac attcccattt ctattcacat cccgaacgct gcctttatca CCaCATCAGA tgcaccactg GTCTCAACAG
10921 tcaaatgtca tctcagtcac tgcacttatt cagcgcactt cggagggatc gctaccctcc agtatctatc ccaccgccaa gcacaatgcc ctctacattc gcattccagc acagcaaccc
11041 TCCAACAGTC GACAGTTCAT GTCCTGGaGA AACCAGCGCT cacagtacac ttcagcaccg ccaccccaca ggcgaacttc ATTGTATCGC TGTCTCGTAA GAAGACAACA tgcaatgcag
11161 AATCCAAACC ACCACCTCAT CaTaTCGTCA GCACCCCGCA CAAAAATCAC CaaGAATTCC AAGCCGCCAT CTCAAAAACT TCATCGACTT CGCTGTTTGC CCTTTTCGGC GGCGCCTCCT
11281 CGCTATTAAT TaTaCCaCTT ATGATTTTTC CTTCCAGCAT CaTGCTGaCT aGCaCaCGaa GATGACCGCT ACGCCCCAAT GACCCCACCA GCAAAaCTCG ATCTaCTTCC CAGGAACTCA
11401 TCTCCATAAT CCATCACCCT CCTATATTaG ATCCCCGCTT ACCGCGGGCA ATATAGCAaC ACCAAAACTC GACCTATTTC CGAGGAAGCG CACTGCATAA TCCTGCGCAG tcttcccaaa

11521 taatcactat attaaccatt tattcacccc accccaaaac tcaatctatt tctcaggaac catcctccat aatgccatgc accctctcca taacttttta ttatttcttt tattaatcaa

11641 CAAAAl 1 I ilj I ; I I I AACAT ttc
Nucleotide Sequence of TR339

1 ATTCGCGOCG TAGTACACAC TATTCAATCA AACAGCCGaC CAATTOCACT ACCATCACAA TOCACAAOCC ACTACTAAAC CTaCACGTAG ACCCCCACAC TCCC7TTTCTC CTCCAACTGC
121 AAAAAAGCTT CCCGCAATTT CACCTACTAC CACaCCACCT CACTCCAAAT CACCATCCTA ATGCCAGAGC ATTTTCCCAT CTCGCCACTA AACTAATCGA GCTGGAGGTT CCTACCACAC

241 CCACCATCTT CCaCaTaCCC accgcacccc ctcgtagaat cttttcccac caccagtatc attctgtctc ccccatccct actccacaac acccccaccc catcatcaaa tatcccacta 
361 AACTXJCCGCA AAAAGCCTCC aagattacaa aCAAGAACTT ccatcagaac attaaggatc tccggaccgt acttcatacg ccccatcctc aaacaccatc ocictccttt cacaacgatg 
481 ttacctccaa catgcgtgcc gaatattccg tcatocacca cctctatatc aacgctcccg caactatcta tcatcaccct atcaaacccg tccccaccct gtactgcatt cccttccaCa
601 CCACCCACTT CATCTTCTCC CCTATCCCAG CTTCCTaCCC tccctacaac accaactccc cccaccacaa actccttgaa ccccctaaca tcggactttg cagcacaaag ctcactgaag
721 CTACCACAOC aaaattctcc ataatcacca acaaccactt caaccccccc tcgcgccttt atttcpccct acgatccaca ctttatccac aacacacacc caccttccac acctggcatc
841 TTCCATCCGT CTTCCACTTG aatccaaacc actcctacac ttgccgctgt catacactcc tcacttccca accctaccta gtgaagaaaa tcaccatcac tccccggatc accccacaaa
961 CCGTCGCATA ccccgttaca cacaataccc accccttctt cctatccaaa gttactgaca cactaaaacc acaacgggta tcgttccctg tctccacgta catccccgcc accatatgcc
1081 atcagatgac tcctataatc gccacccata TATCACCTCA CGATGCACAA AAACTTCTCG ttcccctcaa CCACCCAATT GTCATTAACG CTAGGACTAA CAGGAACACC aacaccatgc
1201 AAAATTACCT TCTCCCGATC atagcacaag ggttcagcaa atcggctaag gagcgcaagg ATGATCTTGA TAACGAGAAA ATGCTCGCTA ctagacaacg CAAGCTTACG TATGGCTCCT
1321 TCTCGGCGTT TCGCACTAAG AAAGTACATT CCTTTTATCG CCCACCTCCA ACGCAGACCA TCGTAAAAGT CCCaGCCTCT TTTAGCCCTT TTCCCATGTC CTCCCTATGG ACCACCTCTT

1441 tgcccatgtc gctgaggcag aaattgaaac tggcattgca accaaagaag gaggaaaaac tccigcaggt ctcggaggaa ttagtcatgg aggccaaggc tccttttcag catcctcagg
1561 AGCAAGCCAG AGCGGAGAAG CTCCGaGAAG cacttccacc ATTaGTCCCA gacaaaggca tcgaggcagc cgcagaagtt ctctgccaag tggaggggct ccaggcgcac atcggagcag 
1681 cattagttga aaccccgcgc cgtcacctaa ggataatacc tcaagcaaat caccctatga tcgcacagta tatcgttgtc tcgccaaact ctgtgctgaa gaatgccaaa ctcgcaccag
1801 cccaccccct agcagatcac gttaagatca taacacactc ccgtagatca ggaacgtacc ccctccaacc atacgacgct aaagtactca tgccagcagg aggtgccgta ccatggccag
1921 AATTCCTAGC actcagtgag agccccacct tactctacaa cgaaagagac tttgtgaacc gcaaactata ccacattgcc atccatcgcc cccccaacaa tacagaagag gagcagtaca
2041 aggttacaaa ggcagagctt gcagaaacag actacgtgtt tgacgtcgac aagaagcgtt gcgttaagaa cgaacaagcc tcaggtctgg tcctctcggg agaactcacc aaccctccct
2161 atcatgagct agctctgcag ggactgaaga cccgacctgc cgtcccgtac aaggtccaaa caataggagt gataggcaca ccccggtccg gcaagtcagc tattatcaag tcaactctca
2281 cggcacggca tcttgttacc agcggaaaga aagaaaattg tcgcgaaatt gaggcccacg tcctaacact gaggggtatg cagattacgt cgaagacact agattccctt atgctcaacg
2401 GATCCCACAA AGCCGTAGAA gtgctgtacg ttcacgaagc gttcgcctgc cacgcaggag cactacttgc cttgattgct ATCGTCAGGC CCCGCAAGAA cot act acta tgcggagacc

2521 CCATCCAATC CCCATTCTTC AACATGATGC AACTAAAGGT ACATTTCAAT CACCCTGAAA AaCACATATG CACCAAGACA TTCTACAAGT ATATCTCCCG CCCTTCCACA CAGCCAGTTA
2641 CAGCTATTCT ATCCACACTG CATTACGATG GAAACATCaA AACCACCAAC CCGTGCAAGA AGAACATTGA AATCGATATT ACAGGGGCCA CAAACCCGAA GCCAGGGGAT ATCATCCTCA

2761 CATGTTTCCG CGCCTGCGTT aaGCAATTGC aaatccacta tcccggacat caagtaatca cagccgccgc ctcacaaggg CTAACCAGAA aaGGAGTGTA TGCCGTCCGG CAAAAAGTCA
2881 ATGAAAACCC ACTCTACGCG ATCACATCAG aGCATGTGAA CCTGTTGCTC aCCCCCACTG AGGACACGCT actctccaaa aCCTTGCAGG gcgacccatg gattaagcag ctcactaaca
3001 tacctaaagg aaactttcag gctactatag aggactggga agctgaacac aagggaataa ttgctgcaat aaacagcccc actccccctc ccaatccgtt cagctccaag accaacgttt
3121 gctgggcgaa agcattcgaa ccgatactag CCACGGCCCG TATCGTACTT ACCCCTTCCC agtcgagcga actcttccca cagtttcccg atgacaaacc acattccgcc ATTTACCCCT
3241 tacacgtaat ttgcattaag tttttcggca tccacttgac aagcggactg ttttctaaac agagcatccc actaacctac catcccgccc attcagccag gccggtagct cattcccaca
3361 ACACCCCAGG AACCCGCAAG tatggctacg atcacgccat tcccgcccaa ctctccccta gatttccggt CTTCCAGCTA GCTGGGAAGG GCACACAACT TCATTTOCAG ACGCGGACAA
3481 ccagacttat ctctccacac cataacctgg tcccggtgaa ccccaatctt cctcacgcct tagtccccga gtacaaggag aagcaacccg gcccgctcga aaaattcttg aaccagttca

3601 AACACCACTC ACTACTTCTG GTATCAGAGG AAAAAATTCA AGCTCCCCCT AAGAGAATCC AATGGATCGC CCCGATTGGC ATaGCCGGTG CAGATAaCAA CTACAACCTG GCTTTCGGGT 
3721 TTCCCCCCCA CCCACGCTAC GaCCTGGTCT TCaTCAACAT TCGAACTAAA TACAGAAACC accactttca CCaGTCCCAA CACCATCCGG CGACCTTAAA AACCCTTTCG CCTTCGGCCC 
3841 TGAATTGCCT TAACCCAGCA GGCACCCTCC TCCTGAAGTC CTATGGCTAC GCCCACCGCA ACACTGAGCA cctactcacc cctcttccca caaagtttct CAGGGTGTCC GCACCCACAC
3961 CACATTGTCT CTCAAGCAAT ACAGAAATCT ACCTCATTTT CCCACAACTA GACAACAGCC GTACACGGCA ATTCACCCCG CACCATCTCA ATTOCCTCaT TTCCTCCCTG TATGAGOGTA 

4081 caagagatgg agttggagcc gcgccctcat accgcaccaa aagggagaat att gc tgact gtcaagagga accacttctc aacgcagcca atccgctgcg tagaccagcc caagcactct
4201 GCCGTGCCAT CTATAAACG7 TGGCCGACCA GTTTTACCCA TTCaGCCaCG GAGaCAGGCA ccgcaagaat gactct g tgc ctacgaaaga aactgatcca cgcgctcggc cctgatttcc
4321 ggaagcaccc acaagcacaa gccttcaaat tgctacaaaa cgcctaccat gcagtggcag acttactaaa tgaacataac atcaagtctg tcgccattcc actgctatct acaggcattt
4441 acgcagcccg aaaagacccc cttgaagtat cacttaactc cttcacaacc gcgctagaca gaactgacgc ccacgtaacc atctattccc tcgataagaa gtggaaggaa agaatcgacg 
4561 ccccactcca acttaaccag tctctaacag acctcaagca tgaacatatc gagatccacg atcagttact atcgatccat ccagacactt gcttgaaggg aagaaaggca ttcagtacta
4681 caaaaccaaa attgtattcg tacttcgaag ccaccaaatt ccatcaagca ccaaaagaca tgcccgagat aaaggtcctc ttccctaatg accacgaaag taatcaacaa ctutut g cct
4801 acatattcgg tcagaccatc caagcaatcc gccaaaagtc cccgctcgac cataacccct cgtctacccc ccccaaaacg ttcccctccc tttgcatcta tcccatcacc ccacaaaggg
4921 TCCACAGACT tagaagcaat aacgtcaaag aagttacagt ATGCTCCTCC accccccttc ctaagcacaa aattaagaat gttcacaagg ttcagtccac caaagtagtc ctctttaatc
5041 cgcacactcc cccattcctt cccgcccgta agtacataga agtgccagaa cagcctaccg ctcctcctgc acacgccgag caggcccccc aagttgtagc cacaccgtca CCATCTACAG
5161 ctcataacac ctcccttcat gtcacagaca tctcactcga tatcgatcac agtagccaag gctcactttt ttccagcttt agcggatcgg acaactctat tactactatg gacagttcgt
5281 cgtcaggacc tacttcacta cacatagtag acccaacgca cgtggtgctc cctcaccttc atgccgtcca agaccctgcc cctattccac cgccaaggct aaagaagatg ccccccctcg
5401 CAGCCCCAAC AAAAGACCCC actccacccg caaccaatag ctctcactcc ctccacctct cttttcgtcg cctatccatg tccctcgcat caattttcca cggagacacg cccccccagc
5521 cagcggtaca acccctcgca acaggcccca cgcatctgcc TATCTCTTTC CCATCGTTTT CCCACCCACA CaTTCATCAG ctgagcccca CAGTAACTCA gtccgaaccc ctcctctttc
5641 GATCATTTGA ACCGCGCGAA GTGAACTCAA ttatatcctc CCGATCAGCC ctatcttttc cactacccaa ccacacacgt ACACGCAGCA ccagcacgac TCAATACTCA ctaaccgccg
5761 taggtgggta CATATTTTCC acggacacag gccctcggca CTTGCAAAAG aagtcccttc TCCAGAACCA ccttacagaa cccaccttcg ACCGCAATCT CCTCGAAACA attcatcccc
5881 ccctcctcca cacctcgaaa gaggaacaac tcaaaCTCag gtaccacatg atccccaccg aacccaacaa aactaggtac cactctccta aagtacaaaa tcagaaaccC ataaccaCTG
6001 AGCCACTACT CTCAGGaCTa CGaCTCTaTA actctcccac agatcagcca gaatgctata agatcaccta tccgaaacca ttctactcca gtagcgtacc cgcgaactac tccgatccac
6121 AGTTCCCTGT ACCTCTCTCT aacaactatc tgcatgagaa ctatccgaca gtagcatctt ATCAGATTAC tgacgagtac gatgcttact tggatatcct agacgcgaca gtcccctccc

6241 TCCATACTCC AACCTTCTGC CCCCCTAAGC TTAGAACTTA CCCGaaaaaa CATCAGTATA GAGCCCCCAA TATCCCCaGT GCCCTTCCAT CAGCGATCCA CAACACCCTA CAAAATCTCC
6361 TCATTCCCCC AACTaaaaGA AATTCCaaCG TCACCCaGAT GCGTCAACTG CCaaCACTGG actcagcgac attcaatgtc caatcctttc CAAAATATCC ATCTAATCAC CAGTATTCCC 
6481 AGCAGTTCCC TCCCAACCCA ATTACCATTa CCACTCaGTT TCTCaCCCCA TATGTACCTA GACTGAAAGG CCCTAAGGCC CCCCCACTAT TTCCAAACAC GTATAATTPG CTCCCATTCC

6601 aagaagtgcc tatccataga ttcctcatgg acatcaaaag acacgtcaaa cttacaccac ccacgaaaca cacagaagaa agacccaaaG tacaaCTCat acaagcccca gaacccctcc
6721 CCACTCCTTA CTTATCCCCC ATTCACCGGC AATTAGTCCG TACCCTTACC CCC CtL I IOC TTCCAAACaT TCACACCCTT TTTCACATCT CCCCCCACCA TTTTGATOCA ATCATAGCAG
6841 AACACTTCAA CCAACCCCAC CCGCTACTGG ACACCCATAT cgcatcattc CACAAAACCC aacacgaccc TATCOCCTTA ACCGGTCTOA tcatcttcca ccacctcgct gtggatcaac

6961 CACTACTCCA cttcatccag tocgcctttc cacaaatatc atccacccat ctacctaccc ctactcgttt taaattcogc cccatcatca aatccocaat cttcctcaca errrrttncA

7081 ACACACTTTT CAATCTCCTT ATCCCCACCA CaCTACTaCA acaccccctt aaaacctcca gatgtgcagc cttcattooc caccacaaca TCATACaTCG AGTAGTATCT CACAAAGAAA
7201 TCCCTCACAC CTCCCCCACC TCOCTCAACA TCCACCTTAA CATCATCCAC CCACTCATCC CTCaCACACC ACCTTACTTC TGCGGCGGAT TTATCTTCCA ACATTCCCTT ACTTCCACaG
7321 CGTGCCGCGT CCCCCACCCC CTCAAAACCC TCnTAACTT GGGTAAACCG CTCCCAGCCG ACGaCCACCA aCACCAACaC AGAAGACGCG CTCTCCTACa TCAAACAAAO GCGTCGTTTA

7441 GACTAGCTAT aacacccact ttagcagtgg ccgtcaccac cccctatcac gtacacaata ttacacctct cctactcgca ttcacaactt TTCCCCACAG CAAAAGAGCA ttccaagcca
7561 tcagagggga aataaagcat ctutacggtg ctcctaaata ctcagcatag tacatttcat ctgactaata ctacaacacc accaccatga ATAGAGGATT ctttaacatg ctcggccgcc 

7681 GCCCCTTCCC CGCCCCCACT GCCATCTCGA GGCCGCCCAG AAGGAGGCAG GCGGCCCCGA TCCCTCCCCC CAACGCGCTG GCTTCTCAAA TCCAGCAACT GACCACAGCC GTCAGTCCCC
7801 TAGTCATTGG ACAGGCAACT ACACCTCAAC CCCCACGTCC ACGCCCGCCA CCGCGCCAGA AGAAGCAGGC GCCCAAGCAA CCACCGAAGC CGAAGAAACC AAAAACGCAG CAGAAGAAGA
7921 AGAAGCAACC TGCAAAACCC AAACCCGGAA AGaGaCAGCG CATGGCACTT AAGTTOGAGG CCCACAGATT GTrCGACCTC AAGAACGAGG ACGGAGATGT catccgccac GCACTGGCCA

8041 tgcaaggaaa gctaatcaaa cctctccacg tcaaaggaac catcgaccac cctctcctat caaagctcaa atttaccaag tcgtcagcat accacatcca gttcgcacag ttcccactca

8161 ACATGAGAAG TCAGCCATTC ACCTACACCA CTCAACACCC CCAAGGATTC TATAACTGGC ACCACGCAGC GCTGCaGTAT AGTGGAGGTA GATTTACCAT CCCTCGCGCA CTAGGAGCCA

8281 gaggacacag cggtcctccg atcatcgata actccgctcg gcttgtcgcg atagtcctcc ctcgagctca tgaaggaaca ccaactgccc tttcggtcct cacctcgaat agtaaaggca

8401 AGACAATTAA GACGACCCCG GAAGGGACAG AAGAGTGGTC CCCAGCACCA CTGGTCACGG CAATGTCTTT GCTCGCAAAT GTGAGCTTCC CATGCCACCG CCCGCCCACA TCCTATACCC
8521 CCCAACCrrC CAGACCCCTC GACATCCTTG AAGaGAACGT GAACCATGAG gcctacgata ccctgctcaa tgccatattg CGGTGCGCAT CGTCTGOCAG AAGCAAAAGA AGCGTCACTG
8641 ACGACTTTAC CCTGACCAGC CCCTACTTGG GCACATGCTC ctactgccac catactgaac cgtccttcag ccctgttaac ATCGAGCAGG TCTGGGACGA AGCGGACGAT aacaccatac
8761 CCATACAGAC ttccgcccag tttggatacg accaaagcgg AGCAGCAAGC GCAAACAAGT ACCGCTACAT GTCGCTTCAG CAGCATCACA CCGTTAAAGA AGGCACCATG GATGACATCA
8881 AGATTAGCAC CTCAGCACCG TGTaGAAGGC TTAGCTACAA AGGATACTTT CTCCTCGCAA AATCCCCPCC AGGGGACAGC GTAACCCTTA GCATAGTCAG tagcaactca gcaacctcat

9001 gtacactggc ccgcaagata aaaccaaaat tcgtgggacg ggaaaaatat gatctacctc ccgttcacgg taaaaaaatt ccttgcacag tgtacgaccg tctgaaagaa acaactgcag
9121 gctacatcac tatgcacagg ccgggaccgc ACGCTTATaC ATCCTACCTG GAAGAATCAT CAGGGAAAGT TTACGCAAAG ccgccatctu ggaagaacat tacgtatgag tccaagtccg
9241 GCCACTACAA caccgcaacc gtttcgaccc gcaccgaaat cactggttcc accgccatca agcagtccgt cccctataag agccaccaaa ccaagtgcct cttcaactca ccgcacttca
9361 TCAGACATGA CGACCACACG GCCCAAGGGA AATTGCATTT gcctttcaag ttoatcccga gtacctgcat ggtccctgtt gcccacgcgc ccaatgtaat acatggcttt AAACACATCA 
9481 CCCTCCAATT AGATACAGAC cacttcacat tgctcaccac caggagacta ggggcaaacc cgcaaccaac cactgaatgg ATCGTCGGAA agacggtcag AAACTTCACC gtccaccgag
9601 atcgcctgga atacatatgg ggaaatcatg agccagtsag ggtctatgcc caagagtcag caccaggaga ccctcacgga tggccacacg aaatagtaca gcattactac catcgccatc
9721 CTGTGTACAC CATCTTAGCC CTCCCATCAG ctaccgtgcc GATCATGATT cccgtaaccg ttgcactctt ATCTCCCTCT AAAGCGCGCC gtgagtgcct gacgccatac cccctcgccc
9841 CAAACGCCGT AATCCCAACT tcgctggcac tcttgtgctg cgttaggtcg cccaatgctg AAACCTTCAC cgacaccatg agttacttgt ggtcgaacag tcaccccttc ttctoggtcc
9961 AGTTCTGCAT acctttggcc gctttcatcg TTUTAATCCG CTGCTCCTCC TCCTCCCTGC CI 1 1 1 1 1 ACT ggttgccggc gcctacctgg cgaaggtaga CGCCTACGAA CATGCCACCA 
10081 CTGTTCCAAA TGTGCCACAG ATACCCTaTA AGGCaCTTGT TCAAAGGGCA cggtatoccc cgctcaattt GGACATCACT GTCATGTCCT CGGAGCTTTT CCCTTCCACC AACCAAGAGT
10201 ACATTACCTG CAAATTCACC ACTCTGGTCC CCTCCCCAAA AATCAAATGC TCCCGCTCCT TGGAATCTCA GCCGGCCCCT CATCCAGACT ATACCTCCAA GGTCTPCGGA GGCCTCTACC 
10321 CCTTTATCTG GGGAGGAGCG CAATGTTTTT GCCACAGTGA GAACAGCCAG ATGAGTGAGG CCTACGTCGA ACTGTCAGCA CATTCCGCGT CTCACCACGC GCAGGCGATT AAGGTGCACA
10441 CTGCCGCCAT GAAAGTAGCA CTGCGTATAG TGTACGGG.VA CACTACCACT TTCCTACaTG TCTACGTGAA CGGAGTCACA CCACGAACGT CTAAAGACTT CAAAGTCATA GCTGCACCAA
10561 TTTCAGCATC GTTTACGCCA ttcgatcata AGGTCGTTaT CCATCGCGGC CTGCTGTACa ACTATGACTT CCCGGAATAT GGaGCGATGa AACCAGGAGC gtttggagac ATTCAAGCTA
10681 CCTCCTTCAC TAGCAAGGAT CTCATCGCCA GCACaGaCAT TACCCTACTC AAGCCTTCCG CCAAGAACGT GCATGTCCCG TACACGCAGG CCGCATCAGC ATTTGAGATG TCGAAAAACA
10801 ACTCAGGCCG CCCACTCCAG GAAACCGCAC CTTTCCGGTC TAACATTGCA GTAAATCCCC TCCCAGCGCT GGaCTCTTCA TACGGGAACA TTCCCaTTTC TATTGACATC CCCAACCCTG
10921 CCTTTATXAG GACATCAGAT GCACCACTCG TCTCAACAGT CAAATGTGAA CTCACTGAGT CCACTTA7TC AGCAGACTTC GGCGGGATGG CCACCCTCCA GTATGTATCC CACCGCGAAG
11041 GTCAATGCCC CCTACATTCG CATTCGAGCA CAGCAACTCT CCAAGAGTCG ACaGTACATG TCCTGGAGAA aGGAGCGCTG ACAGTaCACT TTAGCACCGC CAGTCCACAG GCCAACTTTA

11161 tcgtatccct gtgtgggaag aagacaacat ccaatgcaga atctaaacca ccagctgacc atatcgtgag caccccgcac aaaaatgacc aacaatttca acccgccatc tcaaaaacat
11281 CATGGACTTG cctgtttgcc cttttcggcg GCGCCTCCTC CCTATTAATT ATAGGACTTA TCATTTTTCC TTCCAGCATG ATGCTGACTA GCACACGAAG ATCACCGCTA ccccccaatg
11401 ATCCGACCAG caaaactcca tgtacttccg aggaactsat gtgcataatg catcaggctc ctacattaga tcccccctta ccccgcgcaa tataccaaca ctaaaaactc catgtacttc 
11521 cgaggaagcg cagtgcataa tcctgcgcag tuttgccaca taaccactat attaaccatt tatctagcgg acgccaaaaa ctcaatgtat ttctgagcaa gcgtggtcca taatgccacg
11641 CAGCCTCTCC ataactttta ttatttxttt tattaatcaa caaaattttg tttttaacat ttc